Abstract. When the initial and transition probabilities of a finite Markov chain in discrete
time are not well known, we should perform a sensitivity analysis. This can be done
by considering as basic uncertainty models the so-called credal sets that these probabil- 
ities are known or believed to belong to, and by allowing the probabilities to vary over 
such sets. This leads to the definition of an imprecise Markov chain. We show that the 
time evolution of such a system can be studied very efficiently using so-called lower and 
upper expectations, which are equivalent mathematical representations of credal sets. We 
also study how the inferred credal set about the state at time n evolves as $n \to \infty$: under
quite unrestrictive conditions, it converges to a uniquely invariant credal set, regardless of 
the credal set given for the initial state. This leads to a non-trivial generalisation of the 
classical Perron-Frobenius Theorem to imprecise Markov chains. 



1. Introduction 



One convenient way to model uncertain dynamical systems is to describe them as Mar- 
kov chains. These have been studied in great detail, and their properties are well known. 
However, in many practical situations, it remains a challenge to accurately identify the 
transition probabilities in the Markov chain: the available information about physical sys- 
tems is often imprecise and uncertain. Describing a real-life dynamical system as a Markov 
chain will therefore often involve unwarranted precision, and may lead to conclusions not 
supported by the available information. 

For this reason, it seems quite useful to perform probabilistic robustness studies, or
sensitivity analyses, for Markov chains. This is especially relevant in decision-making
applications. Many researchers in Markov chain decision making, inspired by Satia &
Lave's original 1973 work, have paid attention to this issue of 'imprecision'
in Markov chains.

Work on the more mathematical aspects of modelling such imprecision in Markov
chains was initiated in the early 1980s by Hartfiel & Seneta (see [13, 14, 15]), under the
name 'Markov set-chains'. Hartfiel's work seems to have been unknown to Kozine & Utkin
[21], who approached the subject from a different angle. Armed with linear programming
techniques, these authors performed an experimental study of the limit behaviour of Markov
chains with uncertain transition probabilities. More recently, Škulj [31, 32] has also
contributed to a formal study of the time evolution and limit behaviour of such systems.
Markov set-chains can also be seen as special cases of so-called credal networks under
strong independence.

All these approaches use sets of probabilities to deal with the imprecision in the tran- 
sition probabilities. When these probabilities are not well known, they are assumed to 
belong to certain sets, and robustness analyses are performed by allowing the transition 
probabilities to vary over such sets. This should be contrasted with more common ways of 
performing a sensitivity analysis: looking at small deviations from a reference model and 
evaluating derivatives of important variables in this reference point. 
As we shall see, the sets of probabilities approach leads to a number of computational
difficulties. But we will show that they can be overcome by tackling the problem from
another angle, using lower and upper expectations, rather than sets of probabilities. Our
new method also makes it fairly easy to formulate and prove convergence (or
Perron-Frobenius-like) results for Markov chains with uncertain transition probabilities that hold
under weaker conditions than the ones found by Hartfiel and Škulj. We shall
see that our condition for this convergence, which requires that the imprecise Markov chain
[...] strictly weaker than, both Hartfiel's [...]



In the rest of this Introduction, we give an overview of the theory of classical Markov
chains and formulate the classical Perron-Frobenius theorem. Then, in Sections 2
and 3, we introduce imprecise Markov chains and generalise many aspects of the classical
theory. In Section 4, we briefly discuss accessibility relations, which allows us to
give a nice interpretation to a number of conditions that will turn out to be sufficient for a
Perron-Frobenius-like convergence result. In Section 5, we generalise the classical
Perron-Frobenius theorem, and explore the relation of our generalisation with previous work in the
literature. We discuss a number of theoretical and numerical examples in Section 6, and we
give perspectives for further research in the Conclusions. Proofs of theorems and propositions
have been relegated to an appendix.

1.1. A short analysis of classical Markov chains. Consider a finite Markov chain in
discrete time, where at consecutive times $n = 1, 2, 3, \dots, N$, $N \in \mathbb{N}$, the state $X(n)$ of a
system can assume any value in a finite set $\mathcal{X}$. Here $\mathbb{N}$ denotes the set of non-zero natural
numbers, and $N$ is the time horizon. The time evolution of such a system can be modelled
as if it traversed a path in a so-called event tree; see Shafer [29]. An example of such a tree
for $\mathcal{X} = \{a, b\}$ and $N = 3$ is given in Figure 1.

The situations, or nodes, of the tree have the form $x_{1:k} := (x_1, \dots, x_k) \in \mathcal{X}^k$, $k = 0, 1, \dots, N$.
For $k = 0$ there is some abuse of notation as we let $\mathcal{X}^0 := \{\square\}$, where $\square$ is the so-called
initial situation, or root of the tree. In the cut¹ $\mathcal{X}^n$ of $\square$, the value of the state $X(n)$ at
time $n$ is revealed.



(a,a,a) (a,a,b) (a,b,a) (a,b,b) (b,a,a) (b,a,b) (b,b,a) (b,b,b)

Figure 1. The event tree for the time evolution of a system that can be
in two states, a and b, and can change state at time instants n = 1, 2. Also
depicted are the respective cuts $\mathcal{X}^1$ and $\mathcal{X}^2$ of $\square$ where the states at
times 1 and 2 are revealed.

In a classical analysis, it is generally assumed that we have: (i) a probability distribution
over the initial state $X(1)$, in the form of a probability mass function $m_1$ on $\mathcal{X}$; and (ii) for
each situation $x_{1:n}$ that the system can be in at time $n$, a probability distribution over the next
state $X(n+1)$, in the form of a probability mass function $q(\cdot|x_{1:n})$ on $\mathcal{X}$. This means that in
each non-terminal situation² $x_{1:n}$ of the event tree, we have a local probability model telling
¹ A cut $V$ of a situation $s$ is a collection of descendants $v$ of $s$ such that every path (from root to leaves) through
$s$ goes through exactly one $v$ in $V$.

² A non-terminal situation is a node of the tree that is not a leaf.




us about the probabilities of each of its child nodes. This turns the event tree into a so-called
probability tree; see Shafer [29, Chapter 3] and Kemeny & Snell [19, Section 1.9].

The probability tree for a Markov chain is special, because the Markov Condition states
that when the system jumps from state $X(n) = x_n$ to a new state $X(n+1)$, where the system
goes to will only depend on the state $X(n) = x_n$ the system was in at time $n$, and not on its
states $X(k) = x_k$ at previous times $k = 1, 2, \dots, n-1$. In other words:

$$q(\cdot|x_{1:n}) = q_n(\cdot|x_n), \quad x_{1:n} \in \mathcal{X}^n, \; n = 1, \dots, N-1, \tag{1}$$

where $q_n(\cdot|x_n)$ is some probability mass function on $\mathcal{X}$. The Markov chain may be
non-stationary, as the transition probabilities on the right-hand side in Eq. (1) are allowed to
depend explicitly on the time $n$. Figure 2 gives an example of a probability tree for a
Markov chain with $\mathcal{X} = \{a, b\}$ and $N = 3$.




(a,a,a) (a,a,b) (a,b,a) (a,b,b) (b,a,a) (b,a,b) (b,b,a) (b,b,b)

Figure 2. The probability tree for the time evolution of a Markov chain
that can be in two states, a and b, and can change state at each time
instant n = 1, 2.

With the local probability mass functions $m_1$ and $q_n(\cdot|x_n)$ we associate the linear
real-valued expectation functionals $E_1$ and $E_n(\cdot|x_n)$, given, for all real-valued maps $h$ on $\mathcal{X}$,
by

$$E_1(h) := \sum_{x_1 \in \mathcal{X}} h(x_1) m_1(x_1) \quad\text{and}\quad E_n(h|x_n) := \sum_{x_{n+1} \in \mathcal{X}} h(x_{n+1}) q_n(x_{n+1}|x_n). \tag{2}$$

Throughout, we will formulate our results using expectations, rather than probabilities.³
Our reasons for doing so are not merely aesthetic, or a matter of personal preference; they 
will become clear as we go along. 

In any probability tree, probabilities and expectations can be calculated very efficiently
using backwards recursion.⁴ Suppose that in situation $x_{1:n}$, we want to calculate the
conditional expectation $E(f|x_{1:n})$ of some real-valued map $f$ on $\mathcal{X}^N$ that may depend on the
values of the states $X(1), \dots, X(N)$. Let us indicate briefly how this is done, also taking
into account the simplifications due to the Markov Condition (1).

For these simplifications, a prominent part will be played by the so-called transition
operators⁵ $T_n$ and $\mathbb{T}_n$. Consider the linear space $\mathcal{L}(\mathcal{X})$ of all real-valued maps on $\mathcal{X}$.

³ Arguments for the 'expectation approach' to probability theory were given by Whittle. This approach
is also central in the work of de Finetti [11]. For classical, precise probabilities, whether we use the language
of probability measures, or that of expectation operators, seems to be a matter of personal preference, as the
two approaches are formally equivalent. But for the imprecise-probability models we introduce in Section 2, it
was argued by Walley [33] that the (lower and upper) expectation language is mathematically superior and more
expressive.

⁴ See Chapter 3 of Shafer's book [29] on causal reasoning in probability trees, which contains a number of
propositions about calculating probabilities and expectations in probability trees. That such backwards recursion
is possible was arguably discovered by Christiaan Huygens in the middle of the 17th century. Shafer [29, Appendix A]
discusses Huygens's treatment [16, Appendix VI] of a special case of the so-called Problem of Points,
where Huygens draws what is probably the first recorded probability tree, and solves the problem by backwards
calculation of expectations in the tree.

⁵ The operators $T_n$ are also called the generators of the Markov process; see Whittle.
Then the linear operator (transformation) $T_n \colon \mathcal{L}(\mathcal{X}) \to \mathcal{L}(\mathcal{X})$ is defined by

$$T_n h(x_n) := E_n(h|x_n) = \sum_{x_{n+1} \in \mathcal{X}} h(x_{n+1}) q_n(x_{n+1}|x_n) \tag{3}$$

for all real-valued maps $h$ on $\mathcal{X}$. In other words, $T_n h$ is the real-valued map on $\mathcal{X}$ whose
value $T_n h(x_n)$ in $x_n \in \mathcal{X}$ is the conditional expectation of the random variable $h(X(n+1))$,
given that the system is in state $x_n$ at time $n$. More generally, we also consider the linear
maps $\mathbb{T}_n$ from $\mathcal{L}(\mathcal{X}^{n+1})$ to $\mathcal{L}(\mathcal{X}^n)$, defined by

$$\mathbb{T}_n f(x_{1:n}) := T_n(f(x_{1:n}, \cdot))(x_n) = E_n(f(x_{1:n}, \cdot)|x_n) = \sum_{x_{n+1} \in \mathcal{X}} f(x_{1:n}, x_{n+1}) q_n(x_{n+1}|x_n) \tag{4}$$

for all $x_{1:n} \in \mathcal{X}^n$ and all real-valued maps $f$ on $\mathcal{X}^{n+1}$.⁶

We begin our illustration of backwards recursion by calculating $E(f|x_{1:n})$ for the case
$n = N-1$. Here

$$\begin{aligned}
E(f|x_{1:N-1}) &= E(f(x_{1:N-1}, \cdot)|x_{1:N-1}) \\
&= \sum_{x_N \in \mathcal{X}} f(x_{1:N-1}, x_N) q(x_N|x_{1:N-1}) \\
&= \sum_{x_N \in \mathcal{X}} f(x_{1:N-1}, x_N) q_{N-1}(x_N|x_{N-1}) = \mathbb{T}_{N-1} f(x_{1:N-1}),
\end{aligned} \tag{5}$$

where the third equality follows from the Markov Condition (1), and the fourth from
Eq. (4). Using similar arguments for $n = N-2$, we derive from the Law of Iterated
Expectation⁷ that

$$E(f|x_{1:N-2}) = E(E(f(x_{1:N-2}, \cdot, \cdot)|x_{1:N-2}, \cdot)|x_{1:N-2}) = \mathbb{T}_{N-2} \mathbb{T}_{N-1} f(x_{1:N-2}). \tag{6}$$

Repeating this argument leads to the backwards recursion formulae

$$E(f|x_{1:n}) = \mathbb{T}_n \mathbb{T}_{n+1} \cdots \mathbb{T}_{N-1} f(x_{1:n}) \tag{7}$$

for $n = 1, \dots, N-1$, while for $n = 0$, we get

$$E(f) := E(f|\square) = E_1(\mathbb{T}_1 \mathbb{T}_2 \cdots \mathbb{T}_{N-1} f). \tag{8}$$

In these formulae, $f$ is any real-valued map on $\mathcal{X}^N$. In Figure 3 we give a graphical
representation of calculations using the backwards recursion formulae (7) and (8), for a
two-state stationary Markov chain.
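The backwards recursion in formulae (7) and (8) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the state labels, the numerical values of the initial and transition mass functions, and the example map `visits_to_a` are all made up for the purpose, and the helper names are hypothetical. The sketch compares the recursion with a direct sum over all complete paths.

```python
from itertools import product

STATES = ("a", "b")   # illustrative two-state set
N = 3                 # illustrative time horizon

# Assumed numbers, chosen only for illustration.
m1 = {"a": 0.5, "b": 0.5}                      # initial mass function m_1
q = {"a": {"a": 0.9, "b": 0.1},                # stationary q(y | x)
     "b": {"a": 0.4, "b": 0.6}}


def visits_to_a(path):
    """Example map f on paths: how often state 'a' is visited."""
    return float(sum(1 for x in path if x == "a"))


def backward_expectation(f):
    """E(f) = E_1(T_1 T_2 ... T_{N-1} f), evaluated from the leaves to the root."""
    # values[path] holds E(f | path) for every situation at the current depth.
    values = {path: f(path) for path in product(STATES, repeat=N)}
    for depth in range(N - 1, 0, -1):
        values = {
            path: sum(q[path[-1]][y] * values[path + (y,)] for y in STATES)
            for path in product(STATES, repeat=depth)
        }
    return sum(m1[x] * values[(x,)] for x in STATES)


def path_sum_expectation(f):
    """Reference value: sum of f(path) * p(path) over all complete paths."""
    total = 0.0
    for path in product(STATES, repeat=N):
        prob = m1[path[0]]
        for x, y in zip(path, path[1:]):
            prob *= q[x][y]
        total += prob * f(path)
    return total
```

Both functions should agree up to rounding; the recursion only ever looks one step ahead, which is what the Markov Condition buys us.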

For instance, if we let $f$ be the indicator functions $I_{\{x_{1:N}\}}$ of the singletons $\{x_{1:N}\}$, formulae
(7) and (8) allow us to calculate the joint probability mass function $p$ defined by
$p(x_{1:N}) = E(I_{\{x_{1:N}\}})$ for all the variables $X(1), \dots, X(N)$. We can also use them to find the
conditional mass functions $p_n(\cdot|x_n)$ and $p(\cdot|x_{1:n})$ defined by $p_n(x_{n+1:N}|x_n) = p(x_{n+1:N}|x_{1:n}) = \dots$

1.2. The Perron-Frobenius Theorem for classical Markov chains. We are especially
interested in the case of a stationary Markov chain, and in the (marginal) expectation $E_n(h)$
of a real-valued map $h$ (on $\mathcal{X}$) that depends only on the state $X(n)$ at time $n$. Here, Eq. (8)
becomes

$$E_n(h) := E_1(T^{n-1} h), \tag{9}$$

where $T := T_1 = T_2 = \dots = T_{N-1}$, and where we denote by $T^k$ the $k$-fold composition of
$T$ with itself; in particular, $T^0$ is the identity operator id on $\mathcal{L}(\mathcal{X})$. If we let $h = I_{\{x_n\}}$, this
allows us to find the probability mass function $m_n(x_n) = E_n(I_{\{x_n\}})$, $x_n \in \mathcal{X}$, for the state
$X(n)$.



⁶ The $\mathbb{T}_n$ can be seen as projection operators, since (with some abuse of notation) $\mathbb{T}_n \circ \mathbb{T}_n = \mathbb{T}_n$.

⁷ Also known as the Rule of Total Expectation, or the Rule of Total Probability, or the Conglomerative Property;
see, e.g., Whittle (Section 5.3) or de Finetti [11].
[Figure 3: tree with root value $E(f) = E_1(\mathbb{T}_1 \mathbb{T}_2 f)$; depth-1 values $E(f|a) = \mathbb{T}_1 \mathbb{T}_2 f(a)$ and $E(f|b) = \mathbb{T}_1 \mathbb{T}_2 f(b)$; depth-2 values $E(f|a,a) = \mathbb{T}_2 f(a,a)$, ..., $E(f|b,b) = \mathbb{T}_2 f(b,b)$; leaf values $f(a,a,a)$, ..., $f(b,b,b)$.]

Figure 3. Backwards calculation of the conditional and joint expectations
of a real-valued map $f$ on $\mathcal{X}^3$ for a stationary Markov chain with
state set $\mathcal{X} = \{a, b\}$, and a uniform probability mass function attached
to each non-terminal situation.

By the way, the linear transition operator $T$ is very closely related to the so-called
Markov, or transition, matrix $T$ of the stationary Markov chain, whose elements for all
$(x, y) \in \mathcal{X}^2$ are defined by

$$T_{xy} := q(y|x) = T I_{\{y\}}(x). \tag{10}$$

Any such transition matrix satisfies the conditions $T_{xy} \geq 0$ and $\sum_{z \in \mathcal{X}} T_{xz} = 1$. We will
henceforth call transition matrix any matrix satisfying these properties.⁸ The probability
counterpart of the expectation formula (9) can then be written in matrix form as:

$$m_n = m_1 T^{n-1}, \tag{11}$$

where, here and further on, we also use the notation $m_n$ for the row vector whose components
are the probabilities $m_n(x_n)$, $x_n \in \mathcal{X}$.
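The duality between the expectation form (9) and the matrix form (11) can be made concrete with a small sketch. The transition matrix and initial mass function below are assumed values, and the function names are hypothetical; the point is only that the operator acts on maps (as column vectors) while the matrix acts on mass functions (as row vectors), and both routes yield the same marginal probabilities.

```python
T = [[0.9, 0.1],
     [0.4, 0.6]]       # assumed transition matrix, T[x][y] = q(y | x)
m1 = [0.5, 0.5]        # assumed initial mass function


def apply_operator(h):
    """(T h)(x) = sum_y T[x][y] * h(y): the operator acts on maps (column vectors)."""
    return [sum(T[x][y] * h[y] for y in range(len(h))) for x in range(len(T))]


def step_mass(m):
    """(m T)(y) = sum_x m(x) * T[x][y]: the matrix acts on mass functions (row vectors)."""
    return [sum(m[x] * T[x][y] for x in range(len(m))) for y in range(len(T[0]))]


def marginal_expectation(h, n):
    """E_n(h) = E_1(T^{n-1} h), as in Eq. (9)."""
    for _ in range(n - 1):
        h = apply_operator(h)
    return sum(m1[x] * h[x] for x in range(len(m1)))


def marginal_mass(n):
    """m_n = m_1 T^{n-1}, as in Eq. (11)."""
    m = list(m1)
    for _ in range(n - 1):
        m = step_mass(m)
    return m
```

Taking `h` to be the indicator `[1.0, 0.0]` of the first state, `marginal_expectation(h, n)` should coincide with the first component of `marginal_mass(n)`.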

Under some restrictions on the transition operator $T$, the classical Perron-Frobenius
Theorem then tells us that, as $n$ (as well as the time horizon $N$) recedes to infinity, this
probability mass function $m_n$ converges to some limit, independently of the initial
probability mass function $m_1$; see Kemeny & Snell [19, Theorem 4.1.6] and Luenberger [22,
Chapter 6]. In terms of expectation functionals and transition operators:

Theorem 1.1 (Classical Perron-Frobenius Theorem, Expectation Form). Consider a
stationary Markov chain with finite state set $\mathcal{X}$ and transition operator $T$. Suppose that $T$ is
regular, meaning that there is some $k > 0$ such that $\min T^k I_{\{x\}} > 0$ for all $x$ in $\mathcal{X}$.⁹ Then
for every initial expectation operator $E_1$, the expectation operator $E_n = E_1 \circ T^{n-1}$ for the
state at time $n$ converges point-wise to the same limit expectation operator $E_\infty$:

$$\lim_{n \to \infty} E_n(h) = \lim_{n \to \infty} E_1(T^{n-1} h) =: E_\infty(h) \quad\text{for all } h \in \mathcal{L}(\mathcal{X}). \tag{12}$$

Moreover, the limit expectation $E_\infty$ is the only $T$-invariant expectation on $\mathcal{L}(\mathcal{X})$, in the
sense that $E_\infty = E_\infty \circ T$.
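To make the regularity condition and the convergence statement tangible, here is a minimal sketch with two assumed transition matrices, one regular and one periodic. The function names are hypothetical. The regularity check only inspects powers up to $(n-1)^2 + 1$, which is Wielandt's bound for primitive $n \times n$ matrices; that bound is a standard fact brought in for the sketch, not something stated in the text above.

```python
T_REGULAR = [[0.9, 0.1],
             [0.4, 0.6]]    # assumed example, all entries positive
T_PERIODIC = [[0.0, 1.0],
              [1.0, 0.0]]   # assumed example, the chain flips state at every step


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def is_regular(T):
    """True if some power T^k has only strictly positive entries."""
    n = len(T)
    # Work with 0/1 patterns so that repeated products cannot underflow.
    pattern = [[1 if v > 0 else 0 for v in row] for row in T]
    power = pattern
    for _ in range((n - 1) ** 2 + 1):
        if all(v > 0 for row in power for v in row):
            return True
        power = [[min(1, v) for v in row] for row in mat_mul(power, pattern)]
    return False


def mass_after(m, T, steps):
    """Apply m -> m T the given number of times."""
    for _ in range(steps):
        m = [sum(m[x] * T[x][y] for x in range(len(m))) for y in range(len(T))]
    return m


limit_from_a = mass_after([1.0, 0.0], T_REGULAR, 200)
limit_from_b = mass_after([0.0, 1.0], T_REGULAR, 200)
```

For the regular matrix the two limits should agree, whatever the starting point; for the periodic matrix the mass function keeps oscillating and the starting point is never forgotten.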

2. Towards imprecise Markov chains 

The treatment above rests on the assumption that the initial probabilities and the transi- 
tion probabilities are precisely known. If such is not the case, then it seems necessary to per- 
form some kind of sensitivity analysis, in order to find out to what extent any conclusions 
we might reach using such a treatment, depend on the actual values of these probabilities. 



⁸ In the literature we also find the term stochastic matrix, see Hartfiel [15], for instance.

⁹ This means that there is a $k > 0$ such that all elements of the $k$-th power of the transition matrix $T$ are
(strictly) positive. Matrices with this property are sometimes called regular as well, but this same name is also
used for other matrix properties. Another name for this property is 'primitive'.






GERT DE COOMAN, FILIP HERMANS, AND ERIK QUAEGHEBEUR 



A very general way of performing a sensitivity analysis for probabilities involves
calculations with closed convex sets of probability mass functions, also called credal sets, rather
than with single probability measures. Let $\Sigma_{\mathcal{X}}$ denote the set of all probability mass
functions on $\mathcal{X}$, an $(|\mathcal{X}| - 1)$-dimensional unit simplex in the $|\mathcal{X}|$-dimensional linear space
$\mathbb{R}^{\mathcal{X}}$; then $\{m \in \Sigma_{\mathcal{X}} : (\forall x \in \mathcal{X})\, m(x) \leq 1/2\}$ is a credal set, but
$\{m \in \Sigma_{\mathcal{X}} : (\exists x \in \mathcal{X})\, m(x) \geq 1/2\}$ is not.

There is a growing body of literature on this interesting and fairly new area of imprecise
probabilities, starting with the publication of Walley's [33] seminal work. We refer to the
literature [33, 34, 35] for more details and discussion.

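Let us recall a number of results for credal sets, important for the developments in this
paper. Proofs can be found in Walley's book [33, Chapters 2 and 3]. Specifying a closed
convex set $\mathcal{M}$ of probability mass functions $p$ on a finite set $\mathcal{Y}$ is equivalent to specifying
its lower and upper expectation (functionals) $\underline{E}_{\mathcal{M}} \colon \mathcal{L}(\mathcal{Y}) \to \mathbb{R}$ and $\overline{E}_{\mathcal{M}} \colon \mathcal{L}(\mathcal{Y}) \to \mathbb{R}$,
defined for all $g \in \mathcal{L}(\mathcal{Y})$ by

$$\underline{E}_{\mathcal{M}}(g) := \min\{E_p(g) : p \in \mathcal{M}\} \quad\text{and}\quad \overline{E}_{\mathcal{M}}(g) := \max\{E_p(g) : p \in \mathcal{M}\}, \tag{13}$$

where $E_p(g) = \sum_{y \in \mathcal{Y}} g(y) p(y)$ is the expectation of $g$ associated with the probability mass
function $p$. In a sensitivity analysis, such functionals are quite useful, because they give
tight lower and upper bounds on the expectation of any real-valued map. Since the
functionals $\underline{E}_{\mathcal{M}}$ and $\overline{E}_{\mathcal{M}}$ are conjugate in the sense that $\underline{E}_{\mathcal{M}}(g) = -\overline{E}_{\mathcal{M}}(-g)$ for all real-valued
maps $g$ on $\mathcal{Y}$, one is completely determined if the other is known. Below, we concentrate
on upper expectations. Any upper expectation $\overline{E} = \overline{E}_{\mathcal{M}}$ associated with some credal set $\mathcal{M}$
satisfies the following properties [see, e.g., 33, Section 2.6.1]:

(E1) $\min g \leq \overline{E}(g) \leq \max g$ for all $g$ in $\mathcal{L}(\mathcal{Y})$ (boundedness);

(E2) $\overline{E}(g_1 + g_2) \leq \overline{E}(g_1) + \overline{E}(g_2)$ for all $g_1$ and $g_2$ in $\mathcal{L}(\mathcal{Y})$ (subadditivity);

(E3) $\overline{E}(\lambda g) = \lambda \overline{E}(g)$ for all real $\lambda \geq 0$ and all $g$ in $\mathcal{L}(\mathcal{Y})$ (non-negative homogeneity);

(E4) $\overline{E}(g + \mu I_{\mathcal{Y}}) = \overline{E}(g) + \mu$ for all real $\mu$ and all $g$ in $\mathcal{L}(\mathcal{Y})$ (constant additivity);

(E5) if $g_1 \leq g_2$ then $\overline{E}(g_1) \leq \overline{E}(g_2)$ for all $g_1$ and $g_2$ in $\mathcal{L}(\mathcal{Y})$ (monotonicity);

(E6) if $g_n \to g$ point-wise then $\overline{E}(g_n) \to \overline{E}(g)$ for all sequences $g_n$ in $\mathcal{L}(\mathcal{Y})$ (continuity);

(E7) $\overline{E}(g) \geq -\overline{E}(-g) = \underline{E}(g)$ for all $g$ in $\mathcal{L}(\mathcal{Y})$ (upper-lower consistency).

Conversely, for any real functional $\overline{E}$ that is defined on $\mathcal{L}(\mathcal{Y})$ and that satisfies the
conditions (E1)-(E3), there is a unique credal set $\mathcal{M} \subseteq \Sigma_{\mathcal{Y}}$ such that $\overline{E}$ coincides with the
upper expectation $\overline{E}_{\mathcal{M}}$, namely $\mathcal{M} = \{p \in \Sigma_{\mathcal{Y}} : (\forall f \in \mathcal{L}(\mathcal{Y}))\, E_p(f) \leq \overline{E}(f)\}$. Such an $\overline{E}$
therefore automatically also satisfies conditions (E4)-(E7). It therefore makes sense to call
upper expectation any real functional $\overline{E}$ on $\mathcal{L}(\mathcal{Y})$ that satisfies properties (E1)-(E3).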
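As a small illustration of Eq. (13) and of the conjugacy relation, the sketch below evaluates upper and lower expectations for a credal set on two states that is assumed to be the convex hull of two extreme points (the numbers are invented). Since the expectation is linear in the mass function, its maximum over a polytope is attained at an extreme point, so it is enough to scan those.

```python
# Assumed example: a credal set on two states, given by its extreme points.
EXTREME_POINTS = [(0.2, 0.8), (0.6, 0.4)]


def expectation(p, g):
    """Precise expectation E_p(g) = sum_y g(y) p(y)."""
    return sum(p_y * g_y for p_y, g_y in zip(p, g))


def upper(g):
    """Upper expectation: a linear function on a polytope peaks at an extreme point."""
    return max(expectation(p, g) for p in EXTREME_POINTS)


def lower(g):
    """Lower expectation, obtained from the upper one by conjugacy."""
    return -upper([-v for v in g])
```

With the indicator `[1.0, 0.0]` of the first state, `lower` and `upper` return the smallest and largest probability that the credal set assigns to that state; properties such as boundedness, subadditivity and constant additivity can be spot-checked numerically in the same way.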

What is the upshot of all this for the Markov chain problem we are considering here?
First of all, in the initial situation $\square$, corresponding to time $n = 0$, rather than a single
initial probability mass function $m_1$, we now have a local credal set $\mathcal{M}_1$ of candidate mass
functions $m_1$ for the state $X(1)$ that the system will be in at time $k = 1$. We denote by $\overline{E}_1$
the upper expectation associated with $\mathcal{M}_1$:

$$\overline{E}_1(h) := \max\Big\{\sum_{x \in \mathcal{X}} h(x) m_1(x) : m_1 \in \mathcal{M}_1\Big\} \quad\text{for all } h \in \mathcal{L}(\mathcal{X}). \tag{14}$$

Also, in any situation $x_{1:n} \in \mathcal{X}^n$, corresponding to time $n = 1, 2, \dots, N-1$, instead of a
single transition mass function $q_n(\cdot|x_n)$, we now have a local credal set $\mathcal{Q}_n(\cdot|x_n)$ of candidate
conditional mass functions $q_n(\cdot|x_n)$ for the state $X(n+1)$ that the system will be in at time
$n+1$. We denote by $\overline{E}_n(\cdot|x_n)$ the upper expectation associated with $\mathcal{Q}_n(\cdot|x_n)$, i.e.:

$$\overline{E}_n(h|x_n) := \max\Big\{\sum_{x \in \mathcal{X}} h(x) q(x) : q \in \mathcal{Q}_n(\cdot|x_n)\Big\} \quad\text{for all } h \in \mathcal{L}(\mathcal{X}). \tag{15}$$

We call the resulting model an imprecise Markov chain. Figure 4 gives an example of a
probability tree for an imprecise Markov chain. It is an imprecise-probability tree where
the local conditional models satisfy the Markov Condition:

$$\mathcal{Q}(\cdot|x_{1:n}) = \mathcal{Q}_n(\cdot|x_n) \quad\text{for all } x_{1:n} \in \mathcal{X}^n \text{ and } n = 1, 2, \dots, N-1. \tag{16}$$

A classical, or precise, Markov chain is an imprecise one with credal sets that are singletons.




(a,a,a) (a,a,b) (a,b,a) (a,b,b) (b,a,a) (b,a,b) (b,b,a) (b,b,b)

Figure 4. The tree for the time evolution of an imprecise Markov chain
that can be in two states, a and b, and can change state at each time
instant n = 1, 2.

How, then, can a sensitivity analysis be performed for such an imprecise Markov chain?
We choose, in each non-terminal situation $x_{1:k}$ of the above-mentioned event tree, a local
transition probability mass $q(\cdot|x_{1:k})$ in the set of possible candidates $\mathcal{Q}_k(\cdot|x_k)$.¹⁰ For $k = 0$,
we get the initial situation $\square$, where we choose some element $m_1$ in the set of possible
candidates $\mathcal{M}_1$. By making a choice of local model for each non-terminal situation in
the event tree, we obtain what we call a compatible probability tree, for which we may
calculate all (conditional) expectations and probability mass functions:

$$E(f|x_{1:n}) = \sum_{x_{n+1:N} \in \mathcal{X}^{N-n}} f(x_{1:n}, x_{n+1:N}) \prod_{k=n}^{N-1} q(x_{k+1}|x_{1:k}), \tag{17}$$

$$E(f) = \sum_{x_{1:N} \in \mathcal{X}^N} f(x_{1:N}) m_1(x_1) \prod_{k=1}^{N-1} q(x_{k+1}|x_{1:k}), \tag{18}$$

for $n = 1, \dots, N-1$, and for all real-valued maps $f$ on $\mathcal{X}^N$. As we have just come to
realise, the probability trees that are compatible with an imprecise Markov chain are no
longer necessarily (precise) Markov chains themselves. It is still possible to calculate the
$E(f|x_{1:n})$ and $E(f)$ in Eqs. (17) and (18) using backwards recursion [29, Chapter 3], but the
formulae for doing so will be more complicated than the ones for precise Markov chains
given by Eqs. (7) and (8).

If we repeat this for every other choice of the $m_1$ in $\mathcal{M}_1$ and the $q(\cdot|x_{1:k})$ in $\mathcal{Q}_k(\cdot|x_k)$,
we end up with an infinity of compatible probability trees,¹¹ for which the associated
(conditional) expectations and probability mass functions turn out to constitute closed convex
sets. We denote their corresponding upper expectation functionals on $\mathcal{L}(\mathcal{X}^N)$ by $\overline{E}(\cdot|x_{1:n})$
and $\overline{E}$. These upper expectations, and the conjugate lower expectations, are the final aim
of our sensitivity analysis.

The procedure we have just described is computationally very complex. When the
closed convex sets $\mathcal{M}_1$ and $\mathcal{Q}_k(\cdot|x)$ each have a finite number of extreme points (are
polytopes), we can limit ourselves to working with these sets of extreme points, rather than with
the infinite sets themselves. But even then, the computational complexity of this approach
will generally be exponential in the number of time steps.



¹⁰ These local transition probability masses themselves depend on the situation $x_{1:k}$ they are attached to, but
the sets they are chosen from only depend on the last state $x_k$, as the Markov Condition (16) tells us.

¹¹ Except when all the credal sets are singletons, of course.
However, we will see in Section 3 that the upper expectations $\overline{E}$ and $\overline{E}(\cdot|x_{1:n})$ associated
with the closed convex sets of (conditional) probability mass functions for the compatible
probability trees of an imprecise Markov chain can be calculated in the same way as the
expectations $E$ and $E(\cdot|x_{1:n})$ in a precise one: using counterparts of the backwards
recursion formulae (7)-(9). Because of this, making inferences about the mass function of the
state at time $n$, i.e., finding the upper envelope $\overline{E}_n$ of the $E_n$ given in Eq. (9), now has a
complexity that is linear, rather than exponential, in the number of time steps $n$. This is our
first contribution.

Our second contribution in this paper is a Perron-Frobenius Theorem for a special class
of so-called regularly absorbing stationary imprecise Markov chains: in Section 5 we prove
a generalisation of Theorem 1.1, which tells us that under fairly weak conditions, the upper
expectation operators $\overline{E}_n$ converge to limits that do not depend on the initial upper
expectation operators $\overline{E}_1$. Our result also extends a number of other related convergence theorems
for imprecise Markov chains in the literature [13, 14, 15, 32].

3. Sensitivity analysis of imprecise Markov chains 

We can now take our most important step: deriving the backwards recursion formulae
for the conditional and joint upper expectations in an imprecise Markov chain. We first
define upper transition operators $\overline{T}_n$ and $\overline{\mathbb{T}}_n$. The operator $\overline{T}_n \colon \mathcal{L}(\mathcal{X}) \to \mathcal{L}(\mathcal{X})$ is defined by

$$\overline{T}_n h(x_n) := \overline{E}_n(h|x_n) \tag{19}$$

for all real-valued maps $h$ on $\mathcal{X}$, and all $x_n$ in $\mathcal{X}$. In other words, $\overline{T}_n h$ is the real-valued
map on $\mathcal{X}$ whose value $\overline{T}_n h(x_n)$ in $x_n \in \mathcal{X}$ is the conditional upper expectation of the
random variable $h(X(n+1))$, given that the system is in state $x_n$ at time $n$. More generally,
we also consider the maps $\overline{\mathbb{T}}_n$ from $\mathcal{L}(\mathcal{X}^{n+1})$ to $\mathcal{L}(\mathcal{X}^n)$, defined by

$$\overline{\mathbb{T}}_n f(x_{1:n}) := (\overline{T}_n f(x_{1:n}, \cdot))(x_n) = \overline{E}_n(f(x_{1:n}, \cdot)|x_n) \tag{20}$$

for all $x_{1:n}$ in $\mathcal{X}^n$ and all real-valued maps $f$ on $\mathcal{X}^{n+1}$. Of course, we can also consider
lower expectations and lower transition operators, which are related to the upper
expectations and upper transition operators by conjugacy. As is the case for upper expectations, it
is possible to introduce the notion of an upper transition operator directly, by basing it on a
number of defining properties, rather than by referring to an underlying imprecise Markov
chain. We refer to the Appendix for more details.

The upper expectations $\overline{E}(\cdot|x_{1:n})$ and $\overline{E}$ on $\mathcal{L}(\mathcal{X}^N)$ can be calculated very easily by
backwards recursion, cfr. (7) and (8).

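Theorem 3.1 (Concatenation Formula). For any $x_{1:n}$ in $\mathcal{X}^n$, $n = 1, \dots, N-1$, and for any
real-valued map $f$ on $\mathcal{X}^N$:

$$\overline{E}(f|x_{1:n}) = \overline{\mathbb{T}}_n \overline{\mathbb{T}}_{n+1} \cdots \overline{\mathbb{T}}_{N-1} f(x_{1:n}), \tag{21}$$

$$\overline{E}(f) = \overline{E}_1(\overline{\mathbb{T}}_1 \overline{\mathbb{T}}_2 \cdots \overline{\mathbb{T}}_{N-1} f). \tag{22}$$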
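The Concatenation Formula can be checked numerically on a toy model. The sketch below is illustrative only: the two-state set, the extreme points of the local credal sets, and the map `f` are all assumptions, and the function names are hypothetical. It computes the joint upper expectation once by backwards recursion with upper transition operators, and once by brute force, enumerating every compatible probability tree whose local models are extreme points (which is enough here, because the joint expectation is affine in each local model separately).

```python
from itertools import product

STATES = (0, 1)
N = 3
# Assumed extreme points of the local credal sets (stationary transitions).
M1 = [(0.3, 0.7), (0.6, 0.4)]
Q = {0: [(0.9, 0.1), (0.7, 0.3)],
     1: [(0.4, 0.6), (0.2, 0.8)]}


def f(path):
    """Example map on paths: number of visits to state 1."""
    return float(sum(path))


def upper_by_recursion(f):
    """Joint upper expectation via backwards recursion, as in Eq. (22)."""
    values = {p: f(p) for p in product(STATES, repeat=N)}
    for depth in range(N - 1, 0, -1):
        values = {
            p: max(sum(q[y] * values[p + (y,)] for y in STATES) for q in Q[p[-1]])
            for p in product(STATES, repeat=depth)
        }
    return max(sum(m[x] * values[(x,)] for x in STATES) for m in M1)


def upper_by_enumeration(f):
    """Maximum of E(f) over all compatible trees built from extreme points."""
    situations = [p for d in range(1, N) for p in product(STATES, repeat=d)]
    best = float("-inf")
    for m in M1:
        # One local model per non-terminal situation; it may depend on the whole path.
        for choice in product(*(Q[p[-1]] for p in situations)):
            local = dict(zip(situations, choice))
            total = 0.0
            for path in product(STATES, repeat=N):
                prob = m[path[0]]
                for k in range(1, N):
                    prob *= local[path[:k]][path[k]]
                total += prob * f(path)
            best = max(best, total)
    return best
```

Even in this tiny example the enumeration visits 128 trees, while the recursion performs one small maximisation per non-terminal situation.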

Call, for any non-empty subset $I$ of $\{1, \dots, N\}$, a real-valued map $f$ on $\mathcal{X}^N$ $I$-measurable
if $f(x_{1:N}) = f(z_{1:N})$ for all $x_{1:N}$ and $z_{1:N}$ in $\mathcal{X}^N$ such that $x_k = z_k$ for all $k \in I$. In other
words, an $I$-measurable $f$ only depends on the states $X(k)$ at times $k \in I$. As an example,
an $\{n\}$-measurable map $h$ only depends on the state $X(n)$ at time $n$, and we identify it with
a map on $\mathcal{X}$ (but remember that it acts on states at time $n$). The following proposition tells
us that all conditional upper expectations satisfy a Markov Condition (cfr. (1)).

Proposition 3.2 (Markov Condition). Consider an imprecise Markov chain with finite state
set $\mathcal{X}$ and time horizon $N$. Fix $n \in \{1, \dots, N-1\}$. Let $x_{1:n-1}$ and $z_{1:n-1}$ be arbitrary
elements of $\mathcal{X}^{n-1}$, and let $x_n \in \mathcal{X}$. Let $f$ be any $\{n, n+1, \dots, N\}$-measurable real-valued
map on $\mathcal{X}^N$. Then $\overline{E}(f|x_{1:n-1}, x_n) = \overline{E}(f|z_{1:n-1}, x_n)$, so we may write
$\overline{E}(f|x_{1:n-1}, x_n) = \overline{E}_{|n}(f|x_n)$.
The index '|n' is intended to make clear that we are considering an expectation conditional
on the state $X(n)$ at time $n$.

If we apply the joint upper expectation $\overline{E}$ to maps $h$ that only depend on the state
$X(n)$ at time $n$, we get the marginal upper expectation $\overline{E}_n(h) := \overline{E}(h)$, and $\overline{E}_n$ is a model
for the uncertainty about the state $X(n)$ at time $n$. More generally, taking into account
Proposition 3.2, we use the notation $\overline{E}_{n|\ell}(h|x_\ell) := \overline{E}_{|\ell}(h|x_\ell)$ for the upper expectation of
$h(X(n))$, conditional on $X(\ell) = x_\ell$, with $1 \leq \ell < n$. With notations established in Eq. (15),
$\overline{E}_{n+1|n}(h|x_n) = \overline{E}_n(h|x_n) = \overline{T}_n h(x_n)$. Such expectations can be found using simpler
recursion formulae than Eqs. (21) and (22), as they are based on the simpler upper transition
operators $\overline{T}_k$.

Corollary 3.3. For any real-valued map $h$ on $\mathcal{X}$, and for any $1 \leq \ell < n \leq N$ and all $x_\ell$
in $\mathcal{X}$:

$$\overline{E}_{n|\ell}(h|x_\ell) = \overline{T}_\ell \overline{T}_{\ell+1} \cdots \overline{T}_{n-1} h(x_\ell) \quad\text{and}\quad \overline{E}_n(h) = \overline{E}_1(\overline{T}_1 \overline{T}_2 \cdots \overline{T}_{n-1} h). \tag{23}$$

This offers a reason for formulating our theory in terms of real-valued maps rather than
events: suppose we want to calculate the upper probability $\overline{E}_n(A)$ that the state $X(n)$ at
time $n$ belongs to the set $A$. According to Eq. (23), $\overline{E}_n(A) = \overline{E}_1(\overline{T}_1 \cdots \overline{T}_{n-1} I_A)$, and even
if $\overline{T}_{n-1} I_A$ can still be calculated using upper probabilities only, it will generally assume
values other than 0 and 1, and therefore will generally not be the indicator of some event.
Already after one step, i.e., in order to calculate $\overline{T}_{n-2} \overline{T}_{n-1} I_A$, we need to leave the ambit
of events, and turn to the more general real-valued maps; even if we only want to calculate
upper probabilities after $n$ steps.
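A short numerical sketch may help to see why events alone do not suffice. The local credal sets below are assumed (given by invented extreme points), and `upper_T` is a hypothetical name for a stationary upper transition operator. Applying it once to the indicator of the event A = {1} already produces a map whose values lie strictly between 0 and 1, so the second application has to act on a general real-valued map.

```python
STATES = (0, 1)
# Assumed extreme points of the stationary local credal sets Q(.|x).
Q = {0: [(0.9, 0.1), (0.7, 0.3)],
     1: [(0.4, 0.6), (0.2, 0.8)]}


def upper_T(h):
    """Upper transition operator: (T h)(x) = max over q in Q(.|x) of sum_y q(y) h(y)."""
    return tuple(
        max(sum(q[y] * h[y] for y in STATES) for q in Q[x])
        for x in STATES
    )


indicator_A = (0.0, 1.0)     # indicator of the event A = {1}
once = upper_T(indicator_A)  # upper probabilities of A, one step ahead
twice = upper_T(once)        # needs the upper expectation of a non-indicator map
```

Here `once` only involves upper probabilities of A under each local model, whereas `twice` requires upper expectations of the map `once`, which is not the indicator of any event.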

For joint upper and lower probability mass functions, however, we can remain within 
the ambit of events: 

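Proposition 3.4 (Chapman-Kolmogorov Equations). For an imprecise Markov chain, we
have for all $1 \leq n < m \leq N$ and all $(x_n, x_{n+1:m}) \in \mathcal{X}^{m-n+1}$ that

$$\overline{E}_{|n}(\{x_{n+1:m}\}|x_n) = \prod_{k=n}^{m-1} \overline{T}_k I_{\{x_{k+1}\}}(x_k), \tag{24}$$

and for all $1 \leq m \leq N$ and all $x_{1:m} \in \mathcal{X}^m$ that

$$\overline{E}(\{x_{1:m}\}) = \overline{E}_1(\{x_1\}) \prod_{k=1}^{m-1} \overline{T}_k I_{\{x_{k+1}\}}(x_k). \tag{25}$$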

There are analogous expressions for the lower expectations. 
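Eq. (25) can be spot-checked on a toy model. In the sketch below, the extreme points of the local credal sets are assumed values and all function names are hypothetical. The upper probability of a path is computed once as a product of upper transition probabilities, and once by backwards recursion applied to the indicator of that path.

```python
from itertools import product

STATES = (0, 1)
# Assumed extreme points of the local credal sets (stationary transitions).
M1 = [(0.3, 0.7), (0.6, 0.4)]
Q = {0: [(0.9, 0.1), (0.7, 0.3)],
     1: [(0.4, 0.6), (0.2, 0.8)]}


def upper_transition_probability(x, y):
    """Upper probability of moving from x to y, i.e. the upper operator applied to I_{y}."""
    return max(q[y] for q in Q[x])


def upper_path_probability_product(path):
    """Right-hand side of Eq. (25)."""
    result = max(m[path[0]] for m in M1)
    for x, y in zip(path, path[1:]):
        result *= upper_transition_probability(x, y)
    return result


def upper_path_probability_recursion(path):
    """Left-hand side of Eq. (25): backwards recursion on the indicator of the path."""
    length = len(path)
    values = {p: 1.0 if p == path else 0.0 for p in product(STATES, repeat=length)}
    for depth in range(length - 1, 0, -1):
        values = {
            p: max(sum(q[y] * values[p + (y,)] for y in STATES) for q in Q[p[-1]])
            for p in product(STATES, repeat=depth)
        }
    return max(sum(m[x] * values[(x,)] for x in STATES) for m in M1)
```

The two functions should return the same number for every path, which is the sense in which joint upper probability mass functions stay within the ambit of events.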

4. Accessibility relations 

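From now on, and for the rest of the paper, we mainly consider stationary imprecise
Markov chains with an infinite time horizon. This means that for each time $n \in \mathbb{N}$, we
consider the same upper transition operator $\overline{T}_n = \overline{T}$.

The classification of the states of such a stationary (im)precise Markov chain can be
fruitfully started by introducing a so-called accessibility relation $\cdot \xrightarrow{\cdot} \cdot$: let $x$ and $y$ be any
two states in $\mathcal{X}$ and let $n$ be a number of steps in $\mathbb{N}_0 := \mathbb{N} \cup \{0\}$, then $x \xrightarrow{n} y$ expresses
that $y$ is accessible from $x$ in $n$ steps. To be an accessibility relation, a generic ternary
relation $\cdot \xrightarrow{\cdot} \cdot$ has to satisfy the defining properties:

$$(\forall x, y \in \mathcal{X})\; x \xrightarrow{0} y \Leftrightarrow x = y, \tag{26}$$

$$(\forall x, y, z \in \mathcal{X})(\forall m, n \in \mathbb{N}_0)\; x \xrightarrow{m} y \text{ and } y \xrightarrow{n} z \Rightarrow x \xrightarrow{m+n} z, \tag{27}$$

$$(\forall x \in \mathcal{X})(\forall n \in \mathbb{N})(\exists y \in \mathcal{X})\; x \xrightarrow{n} y. \tag{28}$$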



An accessibility relation is classically derived from the transition matrix of a stationary
Markov chain; in Section 4.2 we will associate such a relation with a stationary imprecise
Markov chain. But for any (abstract) accessibility relation satisfying the conditions (26)-(28),
we can draw all the following conclusions, no matter what transition matrix or
operator it was derived from, or whether it comes about in any other way; Kemeny & Snell
[19, Section 1.4] give a detailed justification. In what follows, we use the terminology
introduced by Kemeny & Snell, but we want to remind the reader that the terms we use may
also have various other meanings in different parts of the literature.

4.1. Abstract accessibility relations. Accessibility relations give rise to many interesting
concepts, which we discuss below. We refer to Figure 5 for a graphical representation.



Figure 5. Three increasingly finer partitions of the state set $\mathcal{X}$ for a
particular stationary (im)precise Markov chain, or more generally, for an
accessibility relation $\cdot \xrightarrow{\cdot} \cdot$. No transition between states of the classes
$C_1$, $C_2$, and $C_3$ is possible, and these classes can be seen as separate
(im)precise Markov chains. The equivalence classes $D_k$ for the communication
relation $\leftrightarrow$ are partially ordered by the relation $\to$ whose (Hasse)
diagram is represented by the upward arrows. Maximal classes are $D_4$,
$D_5$, $D_8$, and $D_9$, the other classes are transient. If $D_4$, $D_5$, $D_8$, and $D_9$ are
aperiodic, the accessibility relation restricted to respectively $C_1$, $C_2$, and
$C_3$ is respectively maximal class regular, top class regular, and regular.


Consider any two states $x$ and $y$ in $\mathcal{X}$. Then $y$ is accessible from $x$, which we denote as
$x \to y$, if there is some $n \in \mathbb{N}_0$ such that $x \xrightarrow{n} y$. If $x$ and $y$ are accessible from one another,
then we say that $x$ and $y$ communicate, which we denote as $x \leftrightarrow y$.

It follows from Eqs. (26) and (27) that the binary relation $\to$ on $\mathcal{X}$ is a preorder, i.e., is
reflexive and transitive. The binary relation $\leftrightarrow$ on $\mathcal{X}$ is the associated equivalence relation.
This communication relation $\leftrightarrow$ partitions the state set $\mathcal{X}$ into equivalence classes $D$ of
states that are accessible from one another, called communication classes. The preorder $\to$
induces a partial order on this partition, also denoted by $\to$.

Undominated or maximal states with respect to the preorder $\to$ are states $x$ such that
$x \to y \Rightarrow y \to x$ for any state $y$ in $\mathcal{X}$. This means that a maximal state has access only
to other maximal states in the same communication class, and to no other states. Collec- 
tions of maximal states, such as the communication classes they belong to, are also called 
maximal. The other states and collections of them, such as the communication classes they 
belong to, are called transient. If all maximal states communicate, or in other words if there 
is a unique maximal communication class, this class is called the top class. It is made up 
of those states that are accessible from any state. 

Consider, for any $x$ and $y$ in $\mathcal{X}$, the set

$$N_{xy} := \{n \in \mathbb{N} : x \xrightarrow{n} y\}, \tag{29}$$

i.e., those numbers of steps after which $y$ is accessible from $x$. We call the period of a
state $x$ the greatest common divisor of the set $N_{xx}$, i.e., $d_x := \gcd\{n \in \mathbb{N} : x \xrightarrow{n} x\}$. Because,
by Eq. (27), $N_{xx}$ is closed under addition, we can rely on a basic number-theoretic result
(see, e.g., Kemeny & Snell [19, Theorem 1.4.1]) which tells us that $N_{xx}$ is, up to perhaps a
finite number of initial elements, equal to the set of all multiples of $d_x$.
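The definition of the period can be illustrated with a small computation. The following is only a sketch under an assumption that the abstract setting does not impose: the $n$-step relation is taken to be generated by a one-step relation, given as a 0/1 matrix `adj` (a hypothetical input format). The function names are illustrative.

```python
from math import gcd


def return_times(adj, x, horizon):
    """Return the n in 1..horizon for which x is accessible from x in exactly n steps."""
    n_states = len(adj)
    current = {x}  # states reachable from x in exactly n steps
    times = []
    for n in range(1, horizon + 1):
        current = {z for y in current for z in range(n_states) if adj[y][z]}
        if x in current:
            times.append(n)
    return times


def period(adj, x):
    """Greatest common divisor of the return times of x (0 if x never returns).

    For a relation generated by a one-step graph, looking 3 * n_states steps
    ahead is enough: every cycle in the class of x can be reached from x and
    left again within n_states - 1 steps each way.
    """
    d = 0
    for n in return_times(adj, x, 3 * len(adj)):
        d = gcd(d, n)
    return d
```

For instance, `period([[0, 1, 0], [0, 0, 1], [1, 0, 0]], 0)` evaluates the period of a state on a three-state cycle, and adding a self-loop at that state makes it aperiodic.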

Now consider an equivalence class $D$ of communicating states, and any two states $x$
and $y$ in that class. Then it is not difficult to show that they have the same period: $x \leftrightarrow y \Rightarrow d_x = d_y$.
We denote by $d_D$ the common period of all elements of the equivalence class $D$.

Proposition 4.1. Consider arbitrary $x$ and $y$ in some maximal communication class $D$.
Then there is some $0 \le t_{xy} < d_D$ such that $n \in N_{xy} \Rightarrow n \equiv t_{xy} \pmod{d_D}$, i.e., $n$ and $t_{xy}$ are
equal up to some multiple of $d_D$. Moreover,

$$(\exists n \in \mathbb{N})(\forall k \ge n)\; t_{xy} + k d_D \in N_{xy}. \tag{30}$$

For any $x$, $y$ and $z$ in this equivalence class $D$, $t_{xy} + t_{yz} \equiv t_{xz} \pmod{d_D}$, and therefore $t_{yz} = 0$
if and only if $t_{xy} = t_{xz}$. This implies that '$t_{yz} = 0$' determines an equivalence relation on
this equivalence class $D$, which further partitions it into $d_D$ subsets, called cyclic classes.
In such a cyclic class, all states $y$ give the same value to $t_{xy}$, for any given $x$ in $D$. Within $D$,
the system moves from cyclic class to cyclic class, in a definite ordered cycle of length $d_D$.
If $D$ is transient, then in some cyclic classes it is possible that, rather than moving to the
next cyclic class, the system moves to (a state in) another equivalence class $D'$ for the
communication relation that is a successor to $D$ for the partial order $\to$.

If $d_D = 1$, or in other words if $t_{xy} = 0$ for all $x, y \in D$, then there is only one cyclic class
in $D$, and we call the communication class $D$, and all its states, aperiodic. If $D$ is moreover
maximal, then $D$ is called regular. The following general characterisation of regularity is
easily derived from Proposition 4.1; see also Kemeny & Snell's arguments [19, Chapters 1
and 4].

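Proposition 4.2. A communication class $D \subseteq \mathcal{X}$ is regular under the accessibility relation
$\cdot \to \cdot$ if and only if

$$(\exists n \in \mathbb{N})(\forall k \ge n)(\forall x, y \in D)\; x \xrightarrow{k} y. \tag{31}$$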

An interesting special case obtains when there is only one equivalence class for the
communication relation (namely $\mathcal{X}$), so $\mathcal{X}$ is maximal, and there is only one cyclic class
(namely $\mathcal{X}$), meaning that all states are aperiodic. In that case, the accessibility relation
$\cdot \to \cdot$ is called regular as well. If all maximal communication classes are regular (aperiodic),
the accessibility relation is called maximal class regular. If there is only one maximal
communication class, and if this top class is moreover regular (aperiodic), then the accessibility
relation is called top class regular. Top class regularity has the following simple alternative
characterisation.

Proposition 4.3. An accessibility relation $\cdot \to \cdot$ is top class regular if and only if the
corresponding set of so-called maximal regular states is non-empty:

$$\mathcal{R}_{\to} := \{x \in \mathcal{X} : (\exists n \in \mathbb{N})(\forall k \ge n)(\forall y \in \mathcal{X})\; y \xrightarrow{k} x\} \neq \emptyset; \tag{32}$$

and in that case this set is the top communication class.
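The set in Eq. (32) can be computed mechanically when the relation is finite. The sketch below assumes, beyond what the abstract setting requires, that the $k$-step relation is the $k$-fold composition of a one-step relation given as a 0/1 matrix `adj` (a hypothetical input format) in which every state has at least one successor. Under that assumption, once every state reaches $x$ in exactly $k$ steps, the same holds for all larger $k$, so it suffices to scan the (eventually periodic) sequence of Boolean matrix powers until a power repeats.

```python
def bool_matmul(a, b):
    """Boolean product of two square 0/1 matrices, returned as nested tuples."""
    n = len(a)
    return tuple(
        tuple(any(a[i][m] and b[m][j] for m in range(n)) for j in range(n))
        for i in range(n)
    )


def maximal_regular_states(adj):
    """States x such that, for all large enough k, every state reaches x in k steps."""
    n = len(adj)
    step = tuple(tuple(bool(v) for v in row) for row in adj)
    if not all(any(row) for row in step):
        raise ValueError("every state needs at least one successor")
    power, seen, regular = step, set(), set()
    while power not in seen:  # the sequence of Boolean powers eventually repeats
        seen.add(power)
        for x in range(n):
            if all(power[y][x] for y in range(n)):  # column x is all ones
                regular.add(x)
        power = bool_matmul(power, step)
    return regular
```

A non-empty result indicates top class regularity in the sense of Proposition 4.3, and the returned set is then the top class; an empty result (for example for a two-state cycle) indicates that the relation is not top class regular.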

4.2. Accessibility relations for imprecise Markov chains. Because we now only consider
stationary imprecise Markov chains, this means that for each time $n \in \mathbb{N}$, we consider
the same transition models $\mathcal{Q}_n(\cdot|x) = \mathcal{Q}(\cdot|x)$, $x \in \mathcal{X}$, or equivalently, for the upper
and lower transition operators: $\overline{T}_n = \overline{T}$ and $\underline{T}_n = \underline{T}$.

Let us denote by $\overline{P}^n_{xy}$ the upper probability of going in $n$ steps from state $x$ to state $y$. For
$n = 0$, $\overline{P}^0_{xy} = I_{\{y\}}(x)$, and for $n \ge 1$, $\overline{P}^n_{xy} = \overline{E}_{k+n|k}(\{y\}|x)$, where (because of stationarity) the
right-hand side does not depend on $k \in \mathbb{N}$. By Corollary 3.3 we find that $\overline{P}^n_{xy} = \overline{T}^n I_{\{y\}}(x)$ for
all $n \in \mathbb{N}_0$. The following two propositions allow us to associate an accessibility relation
with the upper transition operator $\overline{T}$. They are immediate generalisations of similar results
involving (precise) probabilities in (precise) Markov chains:

Proposition 4.4. For all $x$, $y$ and $z$ in $\mathcal{X}$, and for all $n$ and $m$ in $\mathbb{N}_0$,

$$\overline{P}^{n+m}_{xy} \ge \overline{P}^n_{xz}\, \overline{P}^m_{zy}. \tag{33}$$

Proposition 4.5. For all $x$ in $\mathcal{X}$, and for all $n$ in $\mathbb{N}_0$, there is some $y$ in $\mathcal{X}$ such that $\overline{P}^n_{xy} > 0$.

Because of these results, which ensure that Eqs. (27) and (28) are satisfied [Eq. (26) is
trivially satisfied because $\overline{P}^0_{xy} = I_{\{y\}}(x)$], we can define an accessibility relation $\cdot \to \cdot$ using
the $\overline{P}^n_{xy}$: for any $x$ and $y$ in $\mathcal{X}$ and any $n \in \mathbb{N}_0$:

$$x \xrightarrow{n} y \Leftrightarrow \overline{P}^n_{xy} > 0 \Leftrightarrow \overline{T}^n I_{\{y\}}(x) > 0. \tag{34}$$

Clearly, $x \xrightarrow{n} y$ if there is some compatible probability tree in which it is possible (meaning
that there is a non-zero probability) to go from state $x$ to $y$ in $n$ time steps. In other words,
$x \xrightarrow{n} y$ if it is not considered impossible in the context of our imprecise-probability model
to go from $x$ to $y$ in $n$ steps: we then say that $y$ is accessible from $x$ in $n$ steps; and if $x \to y$
then $y$ is accessible from $x$.
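Equation (34) turns accessibility into a computation with the upper transition operator. The sketch below assumes, for illustration only, that each conditional credal set is described by a finite list of candidate mass functions (for instance its extreme points), so that the upper transition operator is a row-wise maximum of ordinary expectations; the names `upper_operator` and `accessible_in` are hypothetical.

```python
def upper_operator(rows_per_state):
    """Build an upper transition operator from finite sets of candidate rows.

    rows_per_state[x] is a list of mass functions q(.|x); the operator maps a
    real-valued map h (a list) to the list of upper expectations of h given x.
    """
    def apply(h):
        return [
            max(sum(q[z] * h[z] for z in range(len(h))) for q in rows)
            for rows in rows_per_state
        ]
    return apply


def accessible_in(upper_t, n_states, x, y, n):
    """Check whether y is accessible from x in exactly n steps, as in Eq. (34)."""
    h = [1.0 if z == y else 0.0 for z in range(n_states)]  # indicator of {y}
    for _ in range(n):
        h = upper_t(h)  # after the loop, h is the n-th iterate applied to the indicator
    return h[x] > 0
```

With two states, where state 0 may either stay put or move to state 1 with probability one half and state 1 is absorbing, `accessible_in(upper_operator([[[1.0, 0.0], [0.5, 0.5]], [[0.0, 1.0]]]), 2, 0, 1, 1)` checks one-step accessibility of state 1 from state 0.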

The following notion will be essential for the convergence result we present in the next 
section. It involves both lower and upper transition probabilities. 

Definition 4.1 (Regularly absorbing). A stationary imprecise Markov chain is called
regularly absorbing if it is top class regular (under $\to$), meaning that

$$\mathcal{R}_{\to} := \{x \in \mathcal{X} : (\exists n \in \mathbb{N})(\forall k \ge n)(\forall y \in \mathcal{X})\; \overline{T}^k I_{\{x\}}(y) > 0\} \neq \emptyset, \tag{35}$$

and if moreover for all $y$ in $\mathcal{X} \setminus \mathcal{R}_{\to}$ there is some $n \in \mathbb{N}$ such that $\underline{T}^n I_{\mathcal{R}_{\to}}(y) > 0$.

In particular, an imprecise Markov chain that is regular (under $\to$, meaning that the
accessibility relation is regular) is also regularly absorbing (under $\to$) in a trivial way.

5. Convergence for stationary imprecise Markov chains 

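We call an upper expectation $\overline{E}$ on $\mathcal{L}(\mathcal{X})$ $\overline{T}$-invariant whenever $\overline{E} \circ \overline{T} = \overline{E}$, so whenever
$\overline{E}(\overline{T}h) = \overline{E}(h)$ for all $h \in \mathcal{L}(\mathcal{X})$.

Theorem 5.1 (Perron-Frobenius Theorem, Upper Expectation Form). Consider a stationary
imprecise Markov chain with finite state set $\mathcal{X}$ that is regularly absorbing. Then for
every initial upper expectation $\overline{E}_1$, the upper expectation $\overline{E}_n = \overline{E}_1 \circ \overline{T}^{n-1}$ for the state at
time $n$ converges point-wise to the same upper expectation $\overline{E}_\infty$:

$$\lim_{n \to \infty} \overline{E}_n(h) = \lim_{n \to \infty} \overline{E}_1(\overline{T}^{n-1} h) =: \overline{E}_\infty(h) \text{ for all } h \text{ in } \mathcal{L}(\mathcal{X}). \tag{36}$$

Moreover, the limit upper expectation $\overline{E}_\infty$ is the only $\overline{T}$-invariant upper expectation on
$\mathcal{L}(\mathcal{X})$.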

Let us compare this convergence result to what exists in the literature. 

The classical Perron-Frobenius Theorem 1.1 is of course a special case of our Theorem 5.1,
because if (the transition operator of) a precise stationary Markov chain is regular
in the sense of Theorem 1.1 then it is also regular (under $\to$), and therefore regularly
absorbing.

Other authors have presented convergence results for stationary imprecise Markov chains,
namely Hartfiel & Seneta [13], Hartfiel [14, 15], and Škulj [32]. They all use the following
approach. They consider some set $\mathcal{T}$ of (one-step) transition matrices $T$, and deduce from
that a corresponding set $\mathcal{T}^n$ of $n$-step transition matrices given by

$$\mathcal{T}^n := \{T_1 T_2 \cdots T_n : T_1, T_2, \ldots, T_n \in \mathcal{T}\}. \tag{37}$$

Hartfiel calls the sequence $\mathcal{T}^n$, $n \in \mathbb{N}$, a Markov set chain. If we also have a set $\mathcal{M}_1$ of
(marginal) mass functions $m_1$ for $X(1)$, then they take the corresponding set $\mathcal{M}_n$ of (marginal)
mass functions for $X(n)$ to be

$$\mathcal{M}_n := \{m_1 T : m_1 \in \mathcal{M}_1 \text{ and } T \in \mathcal{T}^{n-1}\}, \tag{38}$$

where, as before, we also denote by $m$ the row vector corresponding to the mass function $m$.
If we furthermore also denote by $h$ the column vector corresponding to the values $h(x)$
of the real-valued map $h$ in all $x \in \mathcal{X}$, then we find that the corresponding set $\mathcal{E}_n(h)$ of
expectations of $h(X(n))$ is given by

$$\mathcal{E}_n(h) = \{m_1 T h : m_1 \in \mathcal{M}_1 \text{ and } T \in \mathcal{T}^{n-1}\}. \tag{39}$$

Incidentally, these are also the formulae that can be obtained by considering imprecise
Markov chains to be special cases of so-called credal networks under a strong independence
assumption; for more details, see Cozman's work [7, 8] for instance.



Škulj [32] considers the set $\mathcal{T}$ of transition matrices $T$ corresponding to a so-called
interval stochastic matrix, meaning that $\mathcal{T}$ is the set of all transition matrices $T$ such that
$\underline{T} \le T \le \overline{T}$, where $\underline{T}$ and $\overline{T}$ are so-called lower and upper transition matrices; see also
Section 6.3 for the related model in terms of upper transition operators. Hartfiel [14] considers
arbitrary sets of transition matrices, but in his book [15] he also focuses mainly on interval
stochastic matrices.

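What is the relationship between the Markov set-chain model and the model involving
upper transition operators we have studied and motivated above? Consider a stationary
imprecise Markov chain with upper transition operator $\overline{T}$. For each state $x$, as $\overline{T}h(x)$ has
been defined as a conditional upper expectation $\overline{E}(h|x)$, there is a corresponding credal set
$\mathcal{Q}_{\overline{T}}(\cdot|x)$ given by

$$\mathcal{Q}_{\overline{T}}(\cdot|x) := \{q(\cdot|x) \in \Sigma_{\mathcal{X}} : (\forall h \in \mathcal{L}(\mathcal{X}))\; E_{q(\cdot|x)}(h) \le \overline{T}h(x)\}, \tag{40}$$

and then also

$$\overline{T}h(x) = \max\{E_{q(\cdot|x)}(h) : q(\cdot|x) \in \mathcal{Q}_{\overline{T}}(\cdot|x)\}. \tag{41}$$

With these credal sets, we can associate a set of transition matrices $\mathcal{T}_{\overline{T}}$:

$$\mathcal{T}_{\overline{T}} := \{T \in \mathbb{R}^{\mathcal{X} \times \mathcal{X}} : (\forall x \in \mathcal{X})(\exists q(\cdot|x) \in \mathcal{Q}_{\overline{T}}(\cdot|x))(\forall y \in \mathcal{X})\; T_{xy} = q(y|x)\}. \tag{42}$$

In other words, each row $T_x$ of any such transition matrix is formed by the transition
probabilities corresponding to some element of $\mathcal{Q}_{\overline{T}}(\cdot|x)$. The elements $T$ of $\mathcal{T}_{\overline{T}}$ are the
transition matrices that can be constructed using the one-step information contained in the
conditional credal sets $\mathcal{Q}_{\overline{T}}(\cdot|x)$ and therefore in the (one-step) upper transition operator $\overline{T}$.
More generally, the set $\mathcal{T}_{\overline{T}^n}$ contains all $n$-step transition matrices that correspond to the
$n$-step upper transition operator $\overline{T}^n$ (see the Appendix for more details about why we can
also consider $\overline{T}^n$ to be an upper transition operator).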

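Proposition 5.2. Consider a stationary imprecise Markov chain with upper transition
operator $\overline{T}$ and let $n \in \mathbb{N}$. Then

(i) $(\mathcal{T}_{\overline{T}})^n \subseteq \mathcal{T}_{\overline{T}^n}$;

(ii) For all real-valued maps $h$ on $\mathcal{X}$ there is some $T \in (\mathcal{T}_{\overline{T}})^n$ such that for all $x \in \mathcal{X}$,
$\overline{T}^n h(x) = (Th)_x$;

(iii) For all real-valued maps $h$ on $\mathcal{X}$ and all $x \in \mathcal{X}$,

$$\overline{T}^n h(x) = \max\{(Th)_x : T \in (\mathcal{T}_{\overline{T}})^n\} \text{ and } \underline{T}^n h(x) = \min\{(Th)_x : T \in (\mathcal{T}_{\overline{T}})^n\}. \tag{43}$$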

We gather from the following counterexample that for $n > 1$, $(\mathcal{T}_{\overline{T}})^n$ can be strictly included in
$\mathcal{T}_{\overline{T}^n}$. This shows that the model based on imprecise-probability trees and upper transition
operators that we have been using is more detailed than the Markov set chain model.
Nevertheless, as Proposition 5.2(iii) indicates, both models yield very strongly related (if not
identical) results as far as the calculation of marginal expectations for $X(n)$ is concerned.

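Example 5.1. Consider $\overline{T} := (1 - \epsilon)\,\mathrm{id} + I_{\mathcal{X}}\,\epsilon \max$, where $0 < \epsilon < 1$ and $\mathrm{id}$ is the identity
operator, which leaves its argument real-valued map $h$ unchanged: $\mathrm{id}\,h = h$. This
corresponds to a special case of the contamination models (47) discussed in Section 6.1. For
the corresponding 2-step transition operator, we find that $\overline{T}^2 = (1 - \delta)\,\mathrm{id} + I_{\mathcal{X}}\,\delta \max$, with
$\delta := \epsilon(2 - \epsilon)$.

Let $|\mathcal{X}| = 2$, then the sets of corresponding transition matrices are

$$\mathcal{T}_{\overline{T}} = \left\{ \begin{bmatrix} 1 - \epsilon_1 & \epsilon_1 \\ \epsilon_2 & 1 - \epsilon_2 \end{bmatrix} : 0 \le \epsilon_1, \epsilon_2 \le \epsilon \right\} \text{ and } \mathcal{T}_{\overline{T}^2} = \left\{ \begin{bmatrix} 1 - \delta_1 & \delta_1 \\ \delta_2 & 1 - \delta_2 \end{bmatrix} : 0 \le \delta_1, \delta_2 \le \delta \right\}. \tag{44}$$

We now show that the set $(\mathcal{T}_{\overline{T}})^2$ is strictly contained in $\mathcal{T}_{\overline{T}^2}$. Any element of $(\mathcal{T}_{\overline{T}})^2$ is given by

$$\begin{bmatrix} 1 - \epsilon_1 & \epsilon_1 \\ \epsilon_2 & 1 - \epsilon_2 \end{bmatrix} \begin{bmatrix} 1 - \epsilon_3 & \epsilon_3 \\ \epsilon_4 & 1 - \epsilon_4 \end{bmatrix} = \begin{bmatrix} 1 - \epsilon_1 - \epsilon_3 + \epsilon_1\epsilon_3 + \epsilon_1\epsilon_4 & \epsilon_1 + \epsilon_3 - \epsilon_1\epsilon_3 - \epsilon_1\epsilon_4 \\ \epsilon_2 + \epsilon_4 - \epsilon_2\epsilon_4 - \epsilon_2\epsilon_3 & 1 - \epsilon_2 - \epsilon_4 + \epsilon_2\epsilon_4 + \epsilon_2\epsilon_3 \end{bmatrix} \tag{45}$$

for some $0 \le \epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4 \le \epsilon$, and therefore clearly belongs to $\mathcal{T}_{\overline{T}^2}$. But it is
straightforward to check that no choice of $\epsilon_1, \epsilon_2, \epsilon_3, \epsilon_4$ in $[0, \epsilon]$ corresponds to the element of $\mathcal{T}_{\overline{T}^2}$ with
$\delta_1 = \delta_2 = \delta = \epsilon(2 - \epsilon)$. ♦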
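The check left to the reader at the end of Example 5.1 can be spelled out as follows. This is a sketch of one possible argument, using only the off-diagonal entries of the product in Eq. (45) and the assumption $0 < \epsilon < 1$.

```latex
% Upper-right entry of the product in Eq. (45), with all \epsilon_i in [0, \epsilon]:
\delta_1 = \epsilon_1 + \epsilon_3 (1 - \epsilon_1) - \epsilon_1 \epsilon_4
         \le \epsilon_1 + \epsilon (1 - \epsilon_1)
         = \epsilon + \epsilon_1 (1 - \epsilon)
         \le \epsilon (2 - \epsilon) = \delta .
% Since 1 - \epsilon_1 > 0 and 1 - \epsilon > 0, equality holds only if
\epsilon_1 = \epsilon_3 = \epsilon \quad \text{and} \quad \epsilon_4 = 0 .
% The same bound for the lower-left entry,
\delta_2 = \epsilon_2 + \epsilon_4 (1 - \epsilon_2) - \epsilon_2 \epsilon_3 \le \delta ,
% is attained only if
\epsilon_2 = \epsilon_4 = \epsilon \quad \text{and} \quad \epsilon_3 = 0 .
% The two sets of conditions require both \epsilon_3 = \epsilon > 0 and \epsilon_3 = 0,
% so \delta_1 = \delta_2 = \delta cannot be attained by a product of two one-step matrices.
```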



Škulj [32] calls a compact set $\mathcal{T}$ of transition matrices regular if there is some $n > 0$
such that $T_{xy} > 0$ for all $T \in \mathcal{T}^n$ and all $x, y \in \mathcal{X}$. He then shows that for such regular $\mathcal{T}$
and for all compact $\mathcal{M}_1$, the corresponding sequence of compact sets $\mathcal{M}_n$ converges in
Hausdorff norm to the same compact (and invariant) set $\mathcal{M}_\infty$. It follows that for all $h$ and
all compact $\mathcal{M}_1$, the sequence of compact sets $\mathcal{E}_n(h)$ will converge to the same compact set
$\mathcal{E}_\infty(h)$. This is a clear generalisation of the classical Perron-Frobenius Theorem 1.1. But it
follows from Proposition 5.2 that for a given stationary imprecise Markov chain with upper
transition operator $\overline{T}$, the set $\mathcal{T}_{\overline{T}}$ is regular in Škulj's sense if and only if for some $n \in \mathbb{N}$,
$\underline{T}^n I_{\{y\}}(x) > 0$ for all $x, y \in \mathcal{X}$. This is much stronger than even our strongest convergence
requirement of regularity (under $\to$), which only involves the condition $\overline{T}^n I_{\{y\}}(x) > 0$ for
all $x, y \in \mathcal{X}$. Škulj also proves a convergence result for conservative (too large) approximations
of the $\overline{E}_n$, in the special case of a regular (under $\to$) imprecise Markov chain whose
upper transition operator is 2-alternating; see Section 6.3 for further details.

We now turn to Hartfiel's [13, 14, 15] results. The strongest general convergence result
seems to appear in his book [15, Sec. 3.2], where he uses the coefficient of ergodicity $\tau(T)$
of a transition matrix $T$, defined by

$$\tau(T) := \frac{1}{2} \max_{x, y \in \mathcal{X}} \sum_{z \in \mathcal{X}} |T_{xz} - T_{yz}| = 1 - \min_{x, y \in \mathcal{X}} \sum_{z \in \mathcal{X}} \min\{T_{xz}, T_{yz}\}. \tag{46}$$

A transition matrix is called scrambling if $\tau(T) < 1$. Hartfiel calls a compact set $\mathcal{T}$ of
transition matrices product scrambling if there is some $m \in \mathbb{N}$ such that $\tau(T) < 1$ for all
$T \in \mathcal{T}^m$. He then shows that for such product scrambling $\mathcal{T}$ and for all compact $\mathcal{M}_1$,
the corresponding sequence of compact sets $\mathcal{M}_n$ converges in Hausdorff norm to the same
compact (and invariant) set $\mathcal{M}_\infty$. Again, this is a generalisation of the classical
Perron-Frobenius Theorem, and it includes Škulj's above-mentioned result as a special case. We
believe, however, that this approach, based on the coefficient of ergodicity, has a number
of drawbacks that our treatment does not have: the condition seems quite hard to check in
practice, and it is hard to interpret directly. We now also argue that it is too strong, at
least from our point of view.
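For a single transition matrix, the coefficient of ergodicity in Eq. (46) is easy to evaluate. The following sketch computes both expressions in Eq. (46) for a row-stochastic matrix given as a list of rows; the function names are illustrative and not taken from Hartfiel's book.

```python
def ergodicity_coefficient(T):
    """First expression in Eq. (46): half the largest L1 distance between two rows."""
    n = len(T)
    return 0.5 * max(
        sum(abs(T[x][z] - T[y][z]) for z in range(n))
        for x in range(n)
        for y in range(n)
    )


def ergodicity_coefficient_overlap(T):
    """Second expression in Eq. (46): one minus the smallest overlap between two rows."""
    n = len(T)
    return 1.0 - min(
        sum(min(T[x][z], T[y][z]) for z in range(n))
        for x in range(n)
        for y in range(n)
    )


def is_scrambling(T):
    """A transition matrix is scrambling when its coefficient of ergodicity is below one."""
    return ergodicity_coefficient(T) < 1.0
```

The identity matrix has coefficient one and is therefore not scrambling, which is the property used in the counterexamples that follow. Checking the product scrambling condition for a whole set of matrices is a different matter, since it quantifies over all products of a given length.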

Proposition 5.3. Consider a stationary imprecise Markov chain with upper transition
operator $\overline{T}$. If $\mathcal{T}_{\overline{T}}$ is product scrambling, then the chain is regularly absorbing.

Moreover, as the following counterexample shows, it is easy to find examples of stationary
imprecise Markov chains that are regularly absorbing but for which the corresponding set
$\mathcal{T}_{\overline{T}}$ is not product scrambling. Another, perhaps more involved, such counterexample will
be presented near the end of Section 6.4.



Example 5.2 (Vacuous imprecise Markov chain). Consider an arbitrary state set $\mathcal{X}$ with
at least two elements, and the upper transition operator $\overline{T}$ defined by $\overline{T}h = \max h$ for all
real-valued maps $h$ on $\mathcal{X}$. The set $\mathcal{T}_{\overline{T}}$ that corresponds to this upper transition operator is
the set of all transition matrices $\mathcal{T}_{\mathcal{X}}$, and consequently $(\mathcal{T}_{\overline{T}})^n = \mathcal{T}_{\overline{T}^n} = \mathcal{T}_{\mathcal{X}}$ for all $n \in \mathbb{N}$ as
well.

Consider the unit transition matrix $T$ defined by $T_{xy} := \delta_{xy}$ [Kronecker delta], so the
system remains with probability one in any state $x$ that it is in. This $T$ belongs to $(\mathcal{T}_{\overline{T}})^n = \mathcal{T}_{\mathcal{X}}$
for all $n \in \mathbb{N}$, but $\tau(T) = 1$, so $\mathcal{T}_{\mathcal{X}}$ is not product scrambling.

But the chain is regularly absorbing! It is even regular (under $\to$), in a trivial way:
$\overline{T}^n I_{\{y\}}(x) = 1$ for all $n \in \mathbb{N}$ and all $x, y \in \mathcal{X}$. Observe that $\overline{T}^n = \max$ and therefore
$\overline{E}_\infty = \max$ for all $\overline{E}_1$. ♦

6. Examples 

In this section, we indicate how the theory developed in the previous sections can be 
applied in a number of practical situations. For each of these, the upper expectations are 
of some special types that are described in the literature on imprecise probabilities. We 
present concrete and explicit examples, as well as a number of simulations. 

6.1. Contamination models. Suppose we consider a precise stationary Markov chain,
with transition operator $T$. We contaminate it with a vacuous model, i.e., we take a
convex mixture with the upper transition operator $\max$ of Example 5.2. This leads to the
upper transition operator $\overline{T}$, defined by

$$\overline{T}h = (1 - \epsilon)Th + I_{\mathcal{X}}\,\epsilon \max h, \tag{47}$$

for all $h \in \mathcal{L}(\mathcal{X})$, where $\epsilon$ is some constant in the open real interval $(0, 1)$. The underlying
idea is that we consider a specific convex neighbourhood of $T$. Since for all $x$ in $\mathcal{X}$,
$\min \overline{T} I_{\{x\}} = (1 - \epsilon) \min T I_{\{x\}} + \epsilon > 0$, this upper transition operator (or the associated
imprecise Markov chain) is always regular (under $\to$), regardless of whether $T$ is regular (in
the sense of Theorem 1.1)! We infer from Theorem 5.1 that, whatever the initial upper
expectation operator $\overline{E}_1$ is, the upper expectation operator $\overline{E}_n$ for the state $X(n)$ at time $n \in \mathbb{N}$
will always converge to the same $\overline{E}_\infty$.

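What is this $\overline{E}_\infty$ for given $T$ and $\epsilon$? For any $n \ge 1$,

$$\overline{T}^n h = (1 - \epsilon)^n T^n h + I_{\mathcal{X}}\,\epsilon \sum_{\ell=0}^{n-1} (1 - \epsilon)^\ell \max T^\ell h, \tag{48}$$

and therefore

$$\overline{E}_{n+1}(h) = (1 - \epsilon)^n \overline{E}_1(T^n h) + \epsilon \sum_{\ell=0}^{n-1} (1 - \epsilon)^\ell \max T^\ell h. \tag{49}$$

If we now let $n \to \infty$, we see that the limit is indeed independent of the initial upper
expectation $\overline{E}_1$:

$$\overline{E}_\infty(h) = \epsilon \sum_{\ell=0}^{\infty} (1 - \epsilon)^\ell \max T^\ell h. \tag{50}$$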
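Equation (49) can be checked numerically against a direct iteration of the contaminated operator of Eq. (47). The sketch below does so in plain Python. It assumes, for illustration only, that the initial upper expectation is the upper envelope of a finite list of mass functions `masses`; all function names are hypothetical.

```python
def apply_precise(T, h):
    """Apply the precise transition operator: (Th)(x) = sum_z T[x][z] * h[z]."""
    return [sum(T[x][z] * h[z] for z in range(len(h))) for x in range(len(h))]


def apply_contaminated(T, eps, h):
    """Apply the contaminated upper transition operator of Eq. (47)."""
    top = max(h)
    return [(1 - eps) * v + eps * top for v in apply_precise(T, h)]


def upper_initial(masses, h):
    """Initial upper expectation as the upper envelope of a finite set of mass functions."""
    return max(sum(m[z] * h[z] for z in range(len(h))) for m in masses)


def upper_expectation_by_recursion(T, eps, masses, h, n):
    """Upper expectation of h at time n + 1, by applying the upper operator n times."""
    g = list(h)
    for _ in range(n):
        g = apply_contaminated(T, eps, g)
    return upper_initial(masses, g)


def upper_expectation_closed_form(T, eps, masses, h, n):
    """Right-hand side of Eq. (49)."""
    g = list(h)  # holds the l-th precise iterate of h inside the loop
    tail = 0.0
    for l in range(n):
        tail += eps * (1 - eps) ** l * max(g)
        g = apply_precise(T, g)
    return (1 - eps) ** n * upper_initial(masses, g) + tail
```

Both functions should agree up to rounding for any number of steps. For the period-2 cycle `T = [[0.0, 1.0], [1.0, 0.0]]` the value approaches `max(h)` as the number of steps grows, whatever the initial masses are.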

Example 6.1 (Contaminating a cycle). Consider for instance $\mathcal{X} = \{a, b\}$, and let the
precise Markov chain be the cycle with period 2, with transition operator $T$ given by
$Th(a) = h(b)$ and $Th(b) = h(a)$. Then $T^{2n}h = h$ and $T^{2n+1}h = Th$, and therefore $\max T^{2n}h =
\max T^{2n+1}h = \max h$, whence $\overline{E}_\infty(h) = \max h$. So the limit upper expectation is vacuous:
we lose all information about the value of $X(n)$ as $n \to \infty$. ♦

Example 6.2 (Contaminating a random walk). Consider a random walk, where $\mathcal{X} = \{a, b\}$
and $Th = I_{\mathcal{X}} \frac{h(a) + h(b)}{2}$. Then we find that $\overline{E}_n(h) = \epsilon \max h + (1 - \epsilon)\frac{h(a) + h(b)}{2}$ for all $n \ge 2$. ♦

Example 6.3 (Another contamination model). To illustrate the convergence properties of
an imprecise Markov chain, let us look at a simple numerical example. Again consider
$\mathcal{X} = \{a, b\}$ and let the stationary imprecise Markov chain be defined by an initial credal
set $\mathcal{M}_1 = \{m \in \Sigma_{\mathcal{X}} : 0.6 \le m(a) \le 0.9\}$, and a contamination model of the type (47),
with $\epsilon = 0.1$, and for which the precise transition operator $T$ is defined by the transition
matrix

$$T := \begin{bmatrix} q(a|a) & q(b|a) \\ q(a|b) & q(b|b) \end{bmatrix} = \begin{bmatrix} 0.15 & 0.85 \\ 0.85 & 0.15 \end{bmatrix}.$$

In Figure 6 we have plotted the evolution of $\overline{E}_n(\{a\})$ and $\underline{E}_n(\{a\})$, the upper and lower
probability for finding the system in state $a$ at time $n$, which can be calculated efficiently
using Eq. (49).




Figure 6. The time evolution of (i) the upper and lower probability of
finding the imprecise Markov chain of Example 6.3 in the state $a$ (outer
plot marks and connecting lines); and of (ii) the probability of finding the
classical Markov chain of Example 6.3 in the state $a$ (inner plot marks
and connecting lines). The filled area denotes the hull of the evolution of
this probability, under the contamination model of Example 6.3 for all
possible initial mass functions.



For comparison, we have also plotted the evolution of $E_n(\{a\})$, the probability for finding
the system in state $a$ at time $n$, for a (precise) Markov chain defined by probability mass
functions that lie on the boundaries of the credal sets defining the above imprecise Markov
chain; to wit, its initial mass function is given by the row vector $m_1 := [m_1(a)\ m_1(b)] =
[0.9\ 0.1]$ and its transition matrix is $\begin{bmatrix} 0.135 & 0.865 \\ 0.865 & 0.135 \end{bmatrix}$. Here $E_\infty(\{a\}) = E_\infty(\{b\}) = 0.5$. ♦

6.2. Belief function models. The contamination models we have just described are a
special case of a more general and quite interesting class of models, based on Shafer's [28]
notion of a belief function. We can consider a number of subsets $F_j$, $j = 1, \ldots, n$ of $\mathcal{X}$, and
a convex mixture of the vacuous upper expectations relative to these subsets:

$$\overline{E}(h) = \sum_{j=1}^{n} m(F_j) \max_{x \in F_j} h(x), \tag{51}$$

with $m(F_j) > 0$ and $\sum_{j=1}^{n} m(F_j) = 1$. In Shafer's terminology, the sets $F_j$ are called focal
elements, and the $m(F_j)$'s the basic probability assignment.*

We can now consider imprecise Markov chains where the local models, attached to
the non-terminal situations in the tree, are of this type. The general backwards recursion
formulae we have given in Section 3 can then be used in combination with the simple
formulae of the type (51) for an efficient calculation of all conditional and joint upper and
lower expectations in the tree. We leave this implicit however, and move on to another
example, which is rather more popular in the literature.

* Usually, in Shafer's approach, Eq. (51) is only considered for (indicators of) events, and it then defines a
so-called plausibility function, whose conjugate lower probability is a belief function. Eq. (51) gives the point-wise
greatest (most conservative) upper expectation that extends this plausibility function from events to real-valued
maps.
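As a small illustration of Eq. (51), the sketch below evaluates the upper expectation induced by a basic probability assignment, and obtains the lower expectation by conjugacy. The representation of the assignment as a dictionary from focal elements (frozensets of state indices) to masses is an assumption made for this example, as are the function names.

```python
def upper_expectation(focal_masses, h):
    """Eq. (51): mixture of maxima of h over the focal elements."""
    if abs(sum(focal_masses.values()) - 1.0) > 1e-9:
        raise ValueError("the basic probability assignment must sum to one")
    return sum(
        mass * max(h[x] for x in focal)
        for focal, mass in focal_masses.items()
    )


def lower_expectation(focal_masses, h):
    """Conjugate lower expectation: minus the upper expectation of minus h."""
    return -upper_expectation(focal_masses, [-v for v in h])
```

A contamination of the mass function (0.3, 0.7) on two states with a vacuous model, at level 0.1, corresponds to the assignment `{frozenset({0}): 0.27, frozenset({1}): 0.63, frozenset({0, 1}): 0.10}`. Evaluated on the indicator `[1.0, 0.0]`, the two functions then return the plausibility and the belief of the event {0}.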

6.3. Models with lower and upper mass functions. An intuitive way to introduce
imprecise Markov chains [31] goes by way of so-called probability intervals, studied
in a paper by de Campos et al. [2]; see also Walley [33, Section 4.6.1] and Hartfiel [15,
Section 2.1]. It consists in specifying lower and upper bounds for mass functions. Let us
explain how this is done in the specific context of Markov chains.

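For the initial mass function $m_1$, we specify a lower bound $\underline{m}_1 : \mathcal{X} \to \mathbb{R}$, also called a
lower mass function, and an upper bound $\overline{m}_1 : \mathcal{X} \to \mathbb{R}$, called an upper mass function. The
credal set $\mathcal{M}_1$ attached to the initial situation, which corresponds to these bounds, is then
given by

$$\mathcal{M}_1 := \{m \in \Sigma_{\mathcal{X}} : (\forall x \in \mathcal{X})\; \underline{m}_1(x) \le m(x) \le \overline{m}_1(x)\}. \tag{52}$$

Similarly, in each non-terminal situation $x_{1:k} \in \mathcal{X}^k$, $k = 1, \ldots, N - 1$, we have a credal
set $\mathcal{Q}_k(\cdot|x_k)$ that is defined in terms of conditional lower and upper mass functions $\underline{q}_k(\cdot|x_k)$
and $\overline{q}_k(\cdot|x_k)$. Here, for instance, $\underline{q}_k(x_{k+1}|x_k)$ gives a lower bound on the transition
probability $q_k(x_{k+1}|x_k)$ to go from state $X(k) = x_k$ to state $X(k+1) = x_{k+1}$ at time $k$.

Under some consistency conditions (for more details, see [2]) the upper expectation $\overline{E}_1$
associated with $\mathcal{M}_1$ is then given in all subsets $A$ of $\mathcal{X}$ by

$$\overline{E}_1(A) = \min\Big\{ \sum_{z \in A} \overline{m}_1(z),\; 1 - \sum_{z \in \mathcal{X} \setminus A} \underline{m}_1(z) \Big\}. \tag{53}$$

This $\overline{E}_1$ is 2-alternating: $\overline{E}_1(A \cup B) + \overline{E}_1(A \cap B) \le \overline{E}_1(A) + \overline{E}_1(B)$ for all subsets $A$ and $B$
of $\mathcal{X}$. This implies (see [33, Section 3.2.4] and [6, Theorem 8 and Corollary 17]) that for
all $h \in \mathcal{L}(\mathcal{X})$ the upper expectation $\overline{E}_1(h)$ can be found by Choquet integration:

$$\overline{E}_1(h) = \min h + \int_{\min h}^{\max h} \overline{E}_1(\{z \in \mathcal{X} : h(z) \ge \alpha\})\, \mathrm{d}\alpha, \tag{54}$$

where the integral is a Riemann integral. Similar considerations for the 2-alternating $\overline{E}_k(\cdot|x_k)$
lead to formulae for the upper transition operators $\overline{T}_k$: for all $x_k$ in $\mathcal{X}$ and all $A \subseteq \mathcal{X}$,

$$\overline{T}_k I_A(x_k) = \min\Big\{ \sum_{z \in A} \overline{q}_k(z|x_k),\; 1 - \sum_{z \in \mathcal{X} \setminus A} \underline{q}_k(z|x_k) \Big\} \tag{55}$$

$$\overline{T}_k h(x_k) = \min h + \int_{\min h}^{\max h} \overline{T}_k I_{\{z \in \mathcal{X} : h(z) \ge \alpha\}}(x_k)\, \mathrm{d}\alpha. \tag{56}$$

Using $\overline{E}_1$ and the $\overline{T}_k$, all (conditional) expectations in the imprecise Markov chain can now
be calculated, by applying Theorem 3.1 and Corollary 3.3.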
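On a finite state set, the integrands in Eqs. (54) and (56) are step functions, so the Choquet integral reduces to a finite sum over the distinct values of $h$. The sketch below implements this for probability intervals. It assumes the lower and upper mass functions satisfy the consistency conditions mentioned above, and the function names are illustrative.

```python
def upper_probability(lower, upper, event):
    """Upper probability of an event (a set of state indices), as in Eqs. (53) and (55)."""
    inside = sum(upper[z] for z in event)
    outside = sum(lower[z] for z in range(len(lower)) if z not in event)
    return min(inside, 1.0 - outside)


def choquet_upper(lower, upper, h):
    """Upper expectation of h by Choquet integration, as in Eqs. (54) and (56).

    Between two consecutive distinct values lo < hi of h, the level set
    {z : h(z) >= alpha} is constant and equal to {z : h(z) >= hi}.
    """
    levels = sorted(set(h))
    total = levels[0]  # the term min h
    for lo, hi in zip(levels, levels[1:]):
        event = {z for z, v in enumerate(h) if v >= hi}
        total += (hi - lo) * upper_probability(lower, upper, event)
    return total


def choquet_lower(lower, upper, h):
    """Conjugate lower expectation: minus the upper expectation of minus h."""
    return -choquet_upper(lower, upper, [-v for v in h])
```

With the (hypothetical) bounds `lower = [0.2, 0.3, 0.1]` and `upper = [0.5, 0.6, 0.4]`, the upper expectation of `h = [3.0, 1.0, 2.0]` can be cross-checked by hand: the maximising mass function puts 0.5 on the first state, 0.3 on the second and 0.2 on the third.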



Rather than using this backwards recursion method, Škulj [31, 32] uses forward
propagation, which, reformulated using our notations, amounts to the following. The marginal
expectation $\overline{E}_2$ is calculated by $\overline{E}_2 = \overline{E}_1 \circ \overline{T}_1$, $\overline{E}_3$ by $\overline{E}_3 = \overline{E}_2 \circ \overline{T}_2$, and more generally,
$\overline{E}_{n+1} = \overline{E}_n \circ \overline{T}_n$. Even though it appears quite natural, this approach has an important
drawback, especially in the context of the probability interval approach described above. In
order to calculate, say $\overline{E}_3(h)$, we first need to find the upper expectation $\overline{E}_2$, and calculate
its value in the map $\overline{T}_2 h$. But $\overline{E}_2$, as the composition of two 2-alternating models $\overline{E}_1$
and $\overline{T}_1$, is no longer necessarily 2-alternating, and therefore its value in the map $\overline{T}_2 h$
cannot generally be calculated from the values it assumes on events, using Choquet integration,
as in Eqs. (54) and (56). Indeed, Choquet integration will generally give too large a value
for $\overline{E}_3(h)$, and will therefore lead to conservative approximations. These are the difficulties
that Škulj is faced with in his work [31, 32].

They can be circumvented by our backwards recursion approach. Indeed, in order to
find $\overline{E}_n(h)$, we begin by calculating $h_1 := h$ and $h_{k+1} := \overline{T}_{n-k} h_k$, $k = 1, \ldots, n - 1$, using
Eq. (56). Finally, $\overline{E}_n(h) = \overline{E}_1(h_n)$ is calculated using Eq. (54). Our calculations use
Choquet integration but are tight, and not conservative approximations, because at all times,
the intervening local upper expectations are 2-alternating.
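The backwards recursion itself is independent of how the local models are represented. The sketch below takes the initial upper expectation and the upper transition operators as plain Python callables (an assumption made for illustration) and applies the operators from the last one to the first. For the usage example, local models are given as finite sets of candidate mass functions rather than probability intervals, to keep the block short; the helper names are hypothetical.

```python
def backward_upper_expectation(initial_upper, operators, h):
    """Evaluate the initial upper expectation of T_1 T_2 ... T_{n-1} h.

    operators is the list [T_1, ..., T_{n-1}] of callables on lists of reals;
    the innermost operator (the last one in the list) is applied first.
    """
    g = list(h)
    for op in reversed(operators):
        g = op(g)
    return initial_upper(g)


def envelope_operator(rows_per_state):
    """Upper transition operator as a row-wise upper envelope of finitely many rows."""
    def apply(h):
        return [
            max(sum(q[z] * h[z] for z in range(len(h))) for q in rows)
            for rows in rows_per_state
        ]
    return apply


def envelope_expectation(masses):
    """Upper expectation as the upper envelope of finitely many mass functions."""
    def apply(h):
        return max(sum(m[z] * h[z] for z in range(len(h))) for m in masses)
    return apply
```

For a stationary chain one passes the same operator repeatedly, for example `backward_upper_expectation(envelope_expectation([[1.0, 0.0]]), [t, t], [0.0, 1.0])` with `t = envelope_operator([[[1.0, 0.0], [0.5, 0.5]], [[0.0, 1.0]]])`.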

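Example 6.4 (Close to a cycle). Consider a three-state stationary imprecise Markov model
with $\mathcal{X} = \{a, b, c\}$ and with marginal and transition probabilities given by probability
intervals. It follows from Eqs. (55) and (56) that the upper transition operator $\overline{T}$ is fully
determined by the lower and upper transition matrices:

$$\begin{bmatrix} \underline{q}(a|a) & \underline{q}(b|a) & \underline{q}(c|a) \\ \underline{q}(a|b) & \underline{q}(b|b) & \underline{q}(c|b) \\ \underline{q}(a|c) & \underline{q}(b|c) & \underline{q}(c|c) \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} \overline{q}(a|a) & \overline{q}(b|a) & \overline{q}(c|a) \\ \overline{q}(a|b) & \overline{q}(b|b) & \overline{q}(c|b) \\ \overline{q}(a|c) & \overline{q}(b|c) & \overline{q}(c|c) \end{bmatrix},$$

where the numerical values are particular to this example. We have depicted the credal sets
$\mathcal{Q}(\cdot|a)$, $\mathcal{Q}(\cdot|b)$ and $\mathcal{Q}(\cdot|c)$ corresponding to this upper transition operator in Fig. 7.

Figure 7. The credal sets $\mathcal{Q}(\cdot|a)$, $\mathcal{Q}(\cdot|b)$ and $\mathcal{Q}(\cdot|c)$ in the simplex
$\Sigma_{\{a,b,c\}}$, corresponding to the upper transition operator $\overline{T}$ in Example 6.4.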

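Similarly, the initial upper expectation $\overline{E}_1$ is completely determined by the row vectors
$\underline{m}_1 := [\underline{m}_1(a)\ \underline{m}_1(b)\ \underline{m}_1(c)]$ and $\overline{m}_1 := [\overline{m}_1(a)\ \overline{m}_1(b)\ \overline{m}_1(c)]$. In Figure 8 we plot
conservative approximations for the credal sets $\mathcal{M}_n$ corresponding to the upper expectation
operators $\overline{E}_n$.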




Figure 8. Evolution in the simplex $\Sigma_{\{a,b,c\}}$ of the credal sets $\mathcal{M}_n$ for
the near-cyclic transition operator from Example 6.4, for three different
choices of the initial credal set $\mathcal{M}_1$. Panels: n = 1, 2, 3, 4, 5 (top row);
n = 6, 8, [illegible], 22, 1000 (bottom row).

Each approximation is based on the constraints that can be found by calculating $\underline{E}_1(\underline{T}^{n-1} I_{\{x\}})$
and $\overline{E}_1(\overline{T}^{n-1} I_{\{x\}})$ using the backwards recursion method, for $x = a, b, c$. The $\mathcal{M}_n$ evolve
clockwise through the simplex, which is not all that surprising as the lower and upper
transition matrices are quite 'close' to the precise cyclic transition matrix

$$T := \begin{bmatrix} q(a|a) & q(b|a) & q(c|a) \\ q(a|b) & q(b|b) & q(c|b) \\ q(a|c) & q(b|c) & q(c|c) \end{bmatrix},$$

as is also evident from Fig. 7. After a while, the $\mathcal{M}_n$ converge to a limit that is independent
of the initial credal set $\mathcal{M}_1$, as can be predicted from the regularity of the upper transition
operator. ♦

A biological application of imprecise Markov models can be found in Dhaenens's
Master's thesis [9]. He used the sensitivity analysis interpretation of imprecise Markov models
to investigate the legitimacy of using PAM matrices in amino acid and DNA sequence
alignments. Roughly speaking, PAM (point accepted mutation) matrices describe the chance
that one amino acid mutates into another amino acid over a given evolutionary time span.
However, the actual values of PAM matrix components are based on an estimation using an
evolutionary model (i.e., amino acid substitutions are actually counted on the branches of a
phylogenetic tree), hence the need to perform a sensitivity analysis. Dhaenens [9] observed
in simulations that the imprecision due to the estimation did not blow up even after a large
number of steps; he concluded that using PAM matrices over large evolutionary timescales
is still reasonable.

6.4. A k-out-of-n:F system with uncertain reliabilities. Reliability theory is one field
where Markov chains are used extensively. It concerns itself with questions of the type:
What is the probability of failure of a system with $n$ components? In the simplest case,
where each component is either working or not working, answering this question would
involve assessing the failure probabilities of the $2^n$ possible configurations of component
states. However, as shown by Koutras [20], a great variety of reliability structures can be
evaluated quite efficiently using their so-called embedded Markov chain. Amongst these
are precisely those systems that fail as soon as any $k$ out of the $n$ components fail, also
known as k-out-of-n:F systems.

For such systems, the embedded Markov chain is constructed as follows. Its state space
$\mathcal{X}$ is given by $\{0, 1, 2, \ldots, k\}$, where each number represents the number of components
that fail in the system. System failure is therefore represented by the event $\{k\}$, and a fully
functioning system by the event $\{0\}$. Koutras [20] shows that the failure probability (or
unreliability) $F_n$ and the reliability $R_n = 1 - F_n$ of a Markov chain embedded system are
determined by the expectation form expression:

$$F_n := E_{n+1}(I_{\{k\}}) = E_1(T_1 T_2 \cdots T_n I_{\{k\}}), \tag{57}$$

where the initial distribution $E_1$ represents a system in perfect working condition, so $E_1(h) =
h(0)$ for all real-valued maps $h$ on $\mathcal{X}$. The transition matrix $T_i$ corresponding to the
transition operator $T_i$ is fully determined by the reliability $p_i$ of the $i$-th component:

$$T_i = \begin{bmatrix} p_i & 1 - p_i & & & \\ & p_i & 1 - p_i & & \\ & & \ddots & \ddots & \\ & & & p_i & 1 - p_i \\ & & & & 1 \end{bmatrix}, \tag{58}$$

where $(T_i)_{\ell m} = T_i I_{\{m\}}(\ell)$ and $\ell, m \in \{0, 1, \ldots, k\}$.

Precise assessments of the individual reliabilities of the components $p_i$ are often
difficult to come by, as for example, they might depend on climatological parameters, age or
maybe even on the failure of other (external) components. However, experts might still be
able to give conservative bounds on the individual reliabilities $p_i$. In this case, the
embedded Markov chain becomes imprecise, but the corresponding bounds on the reliability and
unreliability can still be computed by applying our sensitivity analysis formulas derived
above:

$$\overline{F}_n = 1 - \underline{R}_n = \overline{E}_1(\overline{T}_1 \overline{T}_2 \cdots \overline{T}_n I_{\{k\}}) \quad \text{and} \quad \underline{F}_n = 1 - \overline{R}_n = \underline{E}_1(\underline{T}_1 \underline{T}_2 \cdots \underline{T}_n I_{\{k\}}). \tag{59}$$

When this embedded Markov chain is stationary (meaning that the uncertainty models for
the reliability of all components are assumed to be the same), the failure probability bounds
are simply computed by $\overline{F}_n = \overline{E}_1(\overline{T}^n I_{\{k\}})$ and $\underline{F}_n = \underline{E}_1(\underline{T}^n I_{\{k\}})$.

To give a very simple example, let us assume that an expert provides the same range
$[\underline{r}, \overline{r}]$ for all component failure probabilities where $0 \le \underline{r} \le \overline{r} \le 1$. This leads to a special
case of the models considered in Section 6.3, and if we apply the formulas derived there,
we get, after some manipulations, that

$$\overline{T}h(\ell) = \begin{cases} \underline{r}\,h(\ell) + (1 - \overline{r})\,h(\ell + 1) + (\overline{r} - \underline{r}) \max\{h(\ell), h(\ell + 1)\} & \text{if } \ell = 0, 1, \ldots, k - 1 \\ h(k) & \text{if } \ell = k \end{cases} \tag{60}$$

for all real-valued maps $h$ on $\mathcal{X}$. If $h$ is non-decreasing in the sense that $h(0) \le h(1) \le
\cdots \le h(k - 1) \le h(k)$, then so is $\overline{T}h$, and it therefore follows that
(61)



k-l 



(62) 



(:=Q 

and there is a completely similar expression for $\underline{F}_n$ where $\overline{r}$ is substituted for $\underline{r}$. See Fig. 9
for a graphical illustration of these expressions.
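The failure bounds for the stationary k-out-of-n:F system can be obtained by iterating an operator of the form of Eq. (60) on the indicator of $\{k\}$. The sketch below rests on a reading that should be treated as an assumption: `r_lo` and `r_hi` bound the probability that the failure count stays where it is in one step (so a failure is added with probability between `1 - r_hi` and `1 - r_lo`). Under that reading, and because the indicator of $\{k\}$ is non-decreasing, the bounds coincide with ordinary binomial tail probabilities, which the last function computes for comparison. All names are illustrative.

```python
from math import comb


def upper_step(h, r_lo, r_hi):
    """One application of an upper transition operator of the form of Eq. (60)."""
    k = len(h) - 1
    out = [
        r_lo * h[l] + (1 - r_hi) * h[l + 1] + (r_hi - r_lo) * max(h[l], h[l + 1])
        for l in range(k)
    ]
    out.append(h[k])  # state k (system failure) is absorbing
    return out


def lower_step(h, r_lo, r_hi):
    """Conjugate lower transition operator: minus the upper operator applied to minus h."""
    return [-v for v in upper_step([-v for v in h], r_lo, r_hi)]


def failure_bounds(k, n, r_lo, r_hi):
    """Lower and upper failure probability after n components, starting from state 0."""
    up = lo = [0.0] * k + [1.0]  # indicator of the failure state k
    for _ in range(n):
        up = upper_step(up, r_lo, r_hi)
        lo = lower_step(lo, r_lo, r_hi)
    return lo[0], up[0]  # the initial model evaluates a map in state 0


def binomial_failure(k, n, r):
    """Probability that at least k of n independent components fail, each working with probability r."""
    return 1.0 - sum(comb(n, l) * (1 - r) ** l * r ** (n - l) for l in range(k))
```

For example, `failure_bounds(3, 5, 0.8, 0.9)` should agree with `binomial_failure(3, 5, 0.9)` for the lower bound and `binomial_failure(3, 5, 0.8)` for the upper bound, up to rounding.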

If $0 < \underline{r} \le \overline{r} \le 1$, then this stationary imprecise Markov chain is regularly absorbing with
regular top class $\{k\}$ (under $\to$), and $\underline{E}_\infty(h) = \overline{E}_\infty(h) = h(k)$ for all real-valued maps $h$ on
$\mathcal{X}$. Nevertheless, as soon as $\overline{r} = 1$, Hartfiel's product scrambling condition is no longer
satisfied, as the identity matrix will then belong to all $(\mathcal{T}_{\overline{T}})^n$.

The chain ceases to be regularly absorbing if $\underline{r} = 0$ and $\overline{r} = 1$, and in that case it is easy
to see that $\overline{T}^n h(m) = \max_{m \le \ell \le \min\{m + n, k\}} h(\ell)$ for all $n \ge 0$ and all real-valued maps $h$ on $\mathcal{X}$, and
therefore the limit upper expectation $\overline{E}_\infty$ will depend on the initial upper expectation $\overline{E}_1$.
For the particular initial expectation $\overline{E}_1$ we use in this example, we see that $\overline{E}_\infty(h) = \max h$.

6.5. General models. When the (conditional) upper expectation operators that define an
imprecise Markov chain do not fall into any of the special cases we discussed and illus-
trated above, recourse must be taken to more general calculation rules.

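Let us consider the typical case of a credal set $\mathcal{M}$ that is specified by giving, for a
finite number of real-valued maps $f$ collected in the set $\mathcal{K} \subseteq \mathcal{L}(\mathcal{X})$, consistent upper
bounds $U(f)$ on the expectations $E(f)$. Then the upper expectation for any map $h \in \mathcal{L}(\mathcal{X})$
can be found by solving the following linear program [see, e.g., [33, Section 3.1.3]]:

$$\overline{E}_{\mathcal{M}}(h) = \min\Big(\mu + \sum_{f \in \mathcal{K}} \lambda_f U(f)\Big) \quad \text{subject to} \quad h \le \mu + \sum_{f \in \mathcal{K}} \lambda_f f, \tag{63}$$

where $\lambda_f \ge 0$ and $\mu \in \mathbb{R}$.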

Figure 9. Upper failure probability ($\overline{F}_n$, full line) and lower failure
probability ($\underline{F}_n$, dashed line) for a 3-out-of-n:F system, for different num-
bers of components $n$ as a function of the imprecision $\epsilon = (\overline{r} - \underline{r})/2$ of
the component reliability, for three different values of $r = (\overline{r} + \underline{r})/2$. As
can be expected, the failure bounds widen with increasing imprecision,
decrease with increasing reliability (characterised by $r$), and increase for
a greater number of components $n$.

As the number of upper expectations to compute, and thus the number of linear pro-
grams to solve, increases, it will eventually become profitable to take a second (dual) ap-
proach. Any credal set specified by a finite number of constraints (bounds on expecta-
tions) is a convex polytope, i.e., has a finite set $\operatorname{ext}\mathcal{M}$ of extreme points. Vertex enumeration
algorithms such as the one by Avis et al. [1] can be used to obtain this set of extreme points
from the given set of constraints. We can then use a practical version of Eq. (15) to find the
corresponding upper expectations, namely [seeQ Section 3.1.3]: 

$$\overline{E}_{\mathcal{M}}(h) = \max\{E_p(h) : p \in \operatorname{ext}\mathcal{M}\}. \tag{64}$$

We can now consider imprecise Markov chains where the local models, attached to
the non-terminal situations in the tree, are of this type. The general backwards recursion
formulae we have given in Section 3 can then be used in combination with the formulae
of the type (63) and (64) for the calculation of all conditional and joint upper and lower
expectations in the tree.
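
As an illustration of how the extreme-point formula combines with backwards recursion, here is a minimal Python sketch for a stationary chain. It assumes that every local credal set is given directly by a finite list of extreme points (probability mass functions as tuples); the function names and the numbers in the usage note are hypothetical.

```python
def upper_expectation(h, extreme_points):
    """Maximise the linear expectation of h over the listed extreme points."""
    return max(sum(p[x] * h[x] for x in range(len(h))) for p in extreme_points)

def upper_transition(h, local_models):
    """(T h)(x): upper expectation of h under the local credal set attached to x."""
    return [upper_expectation(h, local_models[x]) for x in range(len(h))]

def upper_expectation_at_time(h, n, initial_model, local_models):
    """Backwards recursion: apply T (n - 1) times, then the initial upper expectation."""
    g = list(h)
    for _ in range(n - 1):
        g = upper_transition(g, local_models)
    return upper_expectation(g, initial_model)

# Hypothetical two-state example: extreme points of the local credal sets.
LOCAL = [[(0.9, 0.1), (0.7, 0.3)],   # models for the next state, given state 0
         [(0.4, 0.6), (0.2, 0.8)]]   # models for the next state, given state 1
INITIAL = [(1.0, 0.0)]               # the chain starts in state 0
```

With these numbers, `upper_expectation_at_time([0.0, 1.0], 2, INITIAL, LOCAL)` is the upper probability of being in state 1 at time 2. Each additional time step costs one more pass over the states, so the work grows linearly in `n`.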

7. Conclusions 

To conclude, we (i) reflect on what type of convergence results could be obtained for
imprecise Markov chains that are not regularly absorbing, (ii) pay attention to the impor-
tant issue of interpretation of imprecise-probability models, and (iii) compare Hartfiel's
approach [15] to our own regarding their practical applicability to deal with expectation
problems.

It is a reasonably weak requirement for a stationary imprecise Markov chain with upper
transition operator $\overline{T}$ to be regularly absorbing, but we have seen that it is strong enough
to guarantee that the upper expectation for the state at time $n$ converges to a uniquely
$\overline{T}$-invariant upper expectation $\overline{E}_\infty$, regardless of the initial upper expectation $\overline{E}_1$.

Even when an imprecise Markov chain is not regularly absorbing, it is not so hard to see
that its upper transition operator $\overline{T}$ is still non-expansive under the supremum norm given
for every $h \in \mathcal{L}(\mathcal{X})$ by $\|h\|_\infty := \max|h|$, as

$$\|\overline{T}g - \overline{T}h\|_\infty \le \|\overline{T}|g-h|\|_\infty \le \|g-h\|_\infty. \tag{65}$$

Moreover, the sequence $\|\overline{T}^n h\|_\infty$ is bounded because $\|\overline{T}^n h\|_\infty \le \|h\|_\infty$. It then follows from
non-linear Perron-Frobenius theory lEil [30I1 that the sequence T"h has a periodic limit 
cycle. More precisely, there is a $\xi_h \in \mathcal{L}(\mathcal{X})$ such that $\overline{T}^{p_h}\xi_h = \xi_h$, i.e., $\xi_h$ is a periodic
point of $\overline{T}$ with (smallest) period $p_h$, and such that $\overline{T}^{np_h}h \to \xi_h$ (point-wise) as $n \to \infty$. It
would be a very interesting topic for further research to study the nature of the periods and
periodic points of upper transition operators.
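
A small example may make the limit-cycle behaviour concrete. The Python sketch below uses a hypothetical three-state chain, chosen only for illustration: from state 0 the next state is 1 or 2 with completely unknown probabilities (a vacuous local model), while states 1 and 2 swap deterministically. The function names are ours.

```python
def upper_T(h):
    """Upper transition operator of the hypothetical three-state chain."""
    return [max(h[1], h[2]),  # vacuous model on {1, 2}: take the larger value
            h[2],             # state 1 moves to state 2
            h[1]]             # state 2 moves to state 1

def iterate(h, n):
    """Return T^n h."""
    for _ in range(n):
        h = upper_T(h)
    return h

def sup_norm(h):
    return max(abs(v) for v in h)

def is_non_expansive(g, h):
    """Check ||Tg - Th|| <= ||g - h|| in the supremum norm for one pair of maps."""
    lhs = sup_norm([a - b for a, b in zip(upper_T(g), upper_T(h))])
    rhs = sup_norm([a - b for a, b in zip(g, h)])
    return lhs <= rhs
```

For `h = [5.0, 0.0, 1.0]` the even iterates settle on `[1.0, 0.0, 1.0]` and the odd ones on `[1.0, 1.0, 0.0]`: the sequence has a limit cycle of period 2 rather than a limit, which is consistent with this chain not being regularly absorbing.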

In our discussions, for instance in Section 3, we have consistently used the sensitivity
analysis interpretation of imprecise-probability models such as upper expectations. Upper
and lower expectations can also be given another, so-called behavioural interpretation, in
terms of some subject's dispositions towards accepting risky transactions. This is for
instance Walley's [1991] preferred approach. The results we have derived here remain valid
on that alternative interpretation, and the concatenation formulae (21) and (22) can then be
shown to be special cases of so-called marginal extension procedure lEsll . which provides 
the most conservative coherent (i.e., rational) inferences from the local predictive models 
$\overline{T}_k$ to general lower and upper expectations. In another paper [4], we give more details
about how to approach a process theory using imprecise probabilities on a behavioural 
interpretation. 

On a related matter: the imprecise Markov chains we are considering here can be seen
as special credal networks [7, 8, 24]: the generalisation of Bayesian networks to the case
where the local models, associated with the nodes of the network, are credal sets. The cor-
responding 'independence' notion that should then be used for the interpretation of the
graphical structure of the network is Walley's epistemic irrelevance [33, Chapter 9]. Inter-
estingly, Hartfiel's Markov set-chain approach corresponds to special credal nets where the
independence concept involved is a different one: that of strong independence f^. Never- 
theless, both approaches yield the same results if we restrict ourselves to calculating the 
marginal upper expectations for variables $X(n)$, as we have proved in Proposition 5.2. But
in any case, for the actual calculation of expectations, the set of transition matrices ap- 
proach suffers from a combinatorial explosion of computational complexity that can be 
avoided using our upper transition operator approach. 
Appendix A. Proofs

In this Appendix, we have gathered proofs for the results in the paper. 

Before we go on, it will be useful to discuss and collect a number of properties of the
upper transition operators associated with imprecise Markov chains. They follow immedi-
ately from the corresponding properties (E1)-(E7) of upper expectations, so we omit the
proof.

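Proposition A.1 (Properties of upper transition operators). Consider an imprecise Markov
chain with a set of states $\mathcal{X}$ and upper transition operators $\overline{T}_k$. Then for arbitrary $h$, $h_1$,
$h_2$, $h_n$ in $\mathcal{L}(\mathcal{X})$, real $\lambda \ge 0$ and real $\mu$:

(T1) $I_{\mathcal{X}} \min h \le \overline{T}_k h \le I_{\mathcal{X}} \max h$ (boundedness);

(T2) $\overline{T}_k(h_1 + h_2) \le \overline{T}_k h_1 + \overline{T}_k h_2$ (subadditivity);

(T3) $\overline{T}_k(\lambda h) = \lambda \overline{T}_k h$ (non-negative homogeneity);

(T4) $\overline{T}_k(h + \mu I_{\mathcal{X}}) = \overline{T}_k h + \mu I_{\mathcal{X}}$ (constant additivity);

(T5) if $h_1 \le h_2$ then $\overline{T}_k h_1 \le \overline{T}_k h_2$ (monotonicity);

(T6) if $h_n \to h$ point-wise then $\overline{T}_k h_n \to \overline{T}_k h$ point-wise (continuity);

(T7) $\overline{T}_k h \ge -\overline{T}_k(-h) = \underline{T}_k h$ (upper-lower consistency).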

Consider any operator $\overline{T}\colon \mathcal{L}(\mathcal{X}) \to \mathcal{L}(\mathcal{X})$ that satisfies (T1)-(T3). Then for each $x \in \mathcal{X}$,
the real functional $\overline{E}(\cdot|x)$ defined on $\mathcal{L}(\mathcal{X})$ by $\overline{E}(h|x) := \overline{T}h(x)$ is an upper expectation,
because it satisfies (E1)-(E3). This means that we can consider $\overline{T}$ as an upper transition
operator associated with some imprecise Markov chain. It therefore makes sense to call any
operator $\overline{T}$ that satisfies (T1)-(T3) an upper transition operator. Clearly, if $\overline{T}_1, \dots, \overline{T}_n$ are
upper transition operators, then so is their composition $\overline{T}_1 \dots \overline{T}_n$.
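
These properties are easy to spot-check numerically. The Python sketch below is only an illustration, assuming an operator built as a point-wise maximum over finitely many mass functions per state; the helper names, the example models and the test maps are all hypothetical. It checks (T1)-(T5) and (T7) for given maps; continuity (T6) is not checked.

```python
def make_upper_T(local_models):
    """Build T from, for each state, a list of probability mass functions."""
    def T(h):
        return [max(sum(p[y] * h[y] for y in range(len(h))) for p in pts)
                for pts in local_models]
    return T

def leq(a, b, tol=1e-12):
    """Point-wise a <= b, up to a small floating-point tolerance."""
    return all(u <= v + tol for u, v in zip(a, b))

def close(a, b, tol=1e-12):
    return leq(a, b, tol) and leq(b, a, tol)

def check_properties(T, h1, h2, lam, mu):
    """Return which of (T1)-(T5), (T7) hold for these maps, lam >= 0 and real mu."""
    n = len(h1)
    add = lambda a, b: [u + v for u, v in zip(a, b)]
    scale = lambda c, a: [c * u for u in a]
    Th1, Th2 = T(h1), T(h2)
    h_hi = add(h1, [abs(v) for v in h2])          # h1 <= h_hi by construction
    lower_Th1 = scale(-1.0, T(scale(-1.0, h1)))   # conjugate lower operator
    return {
        "T1": leq([min(h1)] * n, Th1) and leq(Th1, [max(h1)] * n),
        "T2": leq(T(add(h1, h2)), add(Th1, Th2)),
        "T3": close(T(scale(lam, h1)), scale(lam, Th1)),
        "T4": close(T(add(h1, [mu] * n)), add(Th1, [mu] * n)),
        "T5": leq(Th1, T(h_hi)),
        "T7": leq(lower_Th1, Th1),
    }

# Hypothetical two-state local models.
T = make_upper_T([[(0.9, 0.1), (0.7, 0.3)], [(0.4, 0.6), (0.2, 0.8)]])
```

A call such as `check_properties(T, [1.0, -2.0], [0.5, 4.0], 2.5, -3.0)` returns a dictionary of booleans, one per property. A passing check is of course no proof; it only guards against implementation mistakes.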

We are now ready to proceed with the proofs of all results in the body of the paper. 

Proof of Theorem 3.1. We first prove by induction that the left-hand sides are dominated
by the right-hand sides in Eqs. (21) and (22). To get the induction process started, we
observe that Eq. (21) holds trivially for $n = N-1$. Next, we prove that if the desired
inequality in Eq. (21) holds for $n = k+1$, it also holds for $n = k$, where $k$ is any element in
$\{1,2,\dots,N-2\}$. Let us fix $x_{1:k} \in \mathcal{X}^k$; then we have to prove that

$$\overline{E}(f|x_{1:k}) \le \overline{T}_k\overline{T}_{k+1}\dots\overline{T}_{N-1}f(x_{1:k}), \tag{66}$$

where we can use that, in particular, for all $x_{k+1} \in \mathcal{X}$:

$$\overline{E}(f|x_{1:k},x_{k+1}) \le \overline{T}_{k+1}\overline{T}_{k+2}\dots\overline{T}_{N-1}f(x_{1:k},x_{k+1}). \tag{67}$$

We have fixed $x_{1:k}$, so we can regard $\overline{E}(f|x_{1:k},\cdot)$ as a real-valued map on $\mathcal{X}$, depending
only on the state $X(k+1)$ at time $k+1$. We denote this map by $h_{k+1}$.

Now consider any compatible probability tree. In particular, let $q(\cdot|x_{1:k}) \in \mathcal{M}_k(\cdot|x_k)$ be
the corresponding local probability mass function for the uncertainty about the state
$X(k+1)$ in the situation $x_{1:k}$ we are considering. It follows from the Law of Iterated Expectations
that in this probability tree

$$E(f|x_{1:k}) = E(E(f|x_{1:k},\cdot)|x_{1:k}), \tag{68}$$

and since $E(f|x_{1:k},\cdot) \le \overline{E}(f|x_{1:k},\cdot) = h_{k+1}$, by definition of the upper expectations in
the tree, we may derive from the monotonicity of expectation operators that $E(f|x_{1:k}) \le
E(h_{k+1}|x_{1:k})$. Now, $h_{k+1}$ is a function of $X(k+1)$ only, so its conditional expectation
$E(h_{k+1}|x_{1:k})$ in situation $x_{1:k}$ can be calculated using the local conditional model $q(\cdot|x_{1:k})$
for $X(k+1)$, i.e.,

$$E(h_{k+1}|x_{1:k}) = \sum_{x_{k+1} \in \mathcal{X}} h_{k+1}(x_{k+1})\,q(x_{k+1}|x_{1:k}) \le \overline{E}_k(h_{k+1}|x_k), \tag{69}$$

where the inequality follows from Eq. (15). Hence $E(f|x_{1:k}) \le \overline{E}_k(h_{k+1}|x_k)$ and therefore

$$\overline{E}(f|x_{1:k}) \le \overline{E}_k(h_{k+1}|x_k) = \overline{T}_k h_{k+1}(x_k) \le \overline{T}_k\big(\overline{T}_{k+1}\overline{T}_{k+2}\dots\overline{T}_{N-1}f(x_{1:k},\cdot)\big)(x_k) = \overline{T}_k\overline{T}_{k+1}\dots\overline{T}_{N-1}f(x_{1:k}), \tag{70}$$

where the first inequality follows from the definition of the upper expectations in the tree,
the first equality follows from Eq. (19), the second inequality from Eq. (67) and the mono-
tonicity (T5) of upper transition operators, and the second equality from Eq. (20).

In a completely similar way, but now using the initial model $\overline{E}_1$ rather than the local
models $\overline{E}_k(\cdot|x_k)$, we can prove that the desired inequalities hold for $n = 0$, given that they
hold for $n = 1$. So now we know that the left-hand sides are dominated by the right-hand
sides in Eqs. (21) and (22).

It remains to prove the converse inequalities. Fix any path in the tree. We denote the
successive situations on this path by $\square$, $x_{1:1}$, $x_{1:2}$, $\dots$, $x_{1:N-1}$, $x_{1:N}$. First, consider the
situation $x_{1:N-1}$ and the partial map $h_N := f(x_{1:N-1},\cdot)$; then we know, because the credal
set $\mathcal{M}_{N-1}(\cdot|x_{N-1})$ is convex and closed, that there is some probability mass function in
$\mathcal{M}_{N-1}(\cdot|x_{N-1})$, which we denote by $q(\cdot|x_{1:N-1})$, such that

$$\sum_{x_N \in \mathcal{X}} h_N(x_N)\,q(x_N|x_{1:N-1}) = \overline{E}_{N-1}(h_N|x_{N-1}) = \overline{T}_{N-1}f(x_{1:N-1},\cdot)(x_{N-1}) = \overline{T}_{N-1}f(x_{1:N-1}), \tag{71}$$

and therefore

$$\overline{T}_{N-1}f(x_{1:N-1}) = \sum_{x_N \in \mathcal{X}} f(x_{1:N-1},x_N)\,q(x_N|x_{1:N-1}). \tag{72}$$

Next, consider the situation $x_{1:N-2}$ and the partial map $h_{N-1} := \overline{T}_{N-1}f(x_{1:N-2},\cdot)$. Again
we know, since $\mathcal{M}_{N-2}(\cdot|x_{N-2})$ is convex and closed, that there is some probability mass
function in $\mathcal{M}_{N-2}(\cdot|x_{N-2})$, which we denote by $q(\cdot|x_{1:N-2})$, such that

$$\sum_{x_{N-1} \in \mathcal{X}} h_{N-1}(x_{N-1})\,q(x_{N-1}|x_{1:N-2}) = \overline{E}_{N-2}(h_{N-1}|x_{N-2}) = \overline{T}_{N-2}\big(\overline{T}_{N-1}f(x_{1:N-2},\cdot)\big)(x_{N-2}) = \overline{T}_{N-2}\overline{T}_{N-1}f(x_{1:N-2}), \tag{73}$$

and therefore

$$\sum_{x_{N-1} \in \mathcal{X}} \overline{T}_{N-1}f(x_{1:N-2},x_{N-1})\,q(x_{N-1}|x_{1:N-2}) = \overline{T}_{N-2}\overline{T}_{N-1}f(x_{1:N-2}). \tag{74}$$

If we combine Eqs. (72) and (74), we find that

$$\sum_{x_{N-1:N} \in \mathcal{X}^2} f(x_{1:N-2},x_{N-1:N})\,q(x_{N-1}|x_{1:N-2})\,q(x_N|x_{1:N-1}) = \overline{T}_{N-2}\overline{T}_{N-1}f(x_{1:N-2}). \tag{75}$$

We can obviously continue in this manner until we reach the root of the tree. We have then
effectively constructed a compatible probability tree for which the associated conditional
and joint expectation operators satisfy for all situations ($n = 1,\dots,N-1$)

$$\overline{E}(f|x_{1:n}) \ge E(f|x_{1:n}) = \sum_{x_{n+1:N} \in \mathcal{X}^{N-n}} f(x_{1:n},x_{n+1:N}) \prod_{k=n}^{N-1} q(x_{k+1}|x_{1:k}) = \overline{T}_n\overline{T}_{n+1}\dots\overline{T}_{N-1}f(x_{1:n}), \tag{76}$$

$$\overline{E}(f) \ge E(f) := \sum_{x_{1:N} \in \mathcal{X}^N} f(x_{1:N})\,m_1(x_1) \prod_{k=1}^{N-1} q(x_{k+1}|x_{1:k}) = \overline{E}_1(\overline{T}_1\overline{T}_2\dots\overline{T}_{N-1}f). \tag{77}$$

This tells us that the converse inequalities in Eqs. (21) and (22) hold as well. □

Proof of Proposition 3.2. We use Eq. (21). It is clear from the definition (20) of the $\overline{T}_k$
that if $f$ is $\{n,n+1,\dots,N\}$-measurable, then $\overline{T}_{N-1}f$ is $\{n,n+1,\dots,N-1\}$-measurable,
and then $\overline{T}_{N-2}\overline{T}_{N-1}f$ is also $\{n,n+1,\dots,N-2\}$-measurable; so by continuing the in-
duction, we find $\overline{T}_{n+1}\dots\overline{T}_{N-1}f$ is $\{n,n+1\}$-measurable, and finally, $\overline{T}_n\dots\overline{T}_{N-1}f$ is
$\{n\}$-measurable. □

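Proof of Corollary 3.3. We use Eqs. (21) and (22) with $f$ defined as follows: $f(x_{1:N}) :=
h(x_n)$ for all $x_{1:N} \in \mathcal{X}^N$. Then, also using (T3), the non-negative homogeneity of upper
transition operators, we find after subsequently applying $\overline{T}_{N-1}, \dots, \overline{T}_\ell$ that

$$\begin{aligned}
\overline{T}_{N-1}f(x_{1:N-1}) &= \overline{T}_{N-1}(h(x_n)I_{\mathcal{X}})(x_{N-1}) = h(x_n) \\
&\;\;\vdots \\
\overline{T}_n\dots\overline{T}_{N-1}f(x_{1:n}) &= \overline{T}_n(h(x_n)I_{\mathcal{X}})(x_n) = h(x_n) \\
\overline{T}_{n-1}\dots\overline{T}_{N-1}f(x_{1:n-1}) &= \overline{T}_{n-1}h(x_{n-1}) \\
\overline{T}_{n-2}\dots\overline{T}_{N-1}f(x_{1:n-2}) &= \overline{T}_{n-2}\overline{T}_{n-1}h(x_{n-2}) \\
&\;\;\vdots \\
\overline{T}_\ell\dots\overline{T}_{N-1}f(x_{1:\ell}) &= \overline{T}_\ell\overline{T}_{\ell+1}\dots\overline{T}_{n-1}h(x_\ell),
\end{aligned} \tag{78}$$

and therefore $\overline{T}_\ell\dots\overline{T}_{N-1}f(x_{1:\ell-1},\cdot) = \overline{T}_\ell\overline{T}_{\ell+1}\dots\overline{T}_{n-1}h$. Applying Proposition 3.2 then
leads to the first desired equality. If, for $\ell = 1$, we now also apply the upper expectation $\overline{E}_1$
to both sides of this equality, the proof is complete. □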

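Proof of Proposition 3.4. As an example, we prove Eq. (24), by applying Eq. (21) with its
parameters chosen as $f = I_{\{x_{n+1:m}\}}$ and $N = m$. We then see that for any $z_{1:m-1} \in \mathcal{X}^{m-1}$:

$$\begin{aligned}
\overline{T}_{m-1}I_{\{x_{n+1:m}\}}(z_{1:m-1}) &= I_{\{x_{n+1:m-1}\}}(z_{n+1:m-1})\,\overline{T}_{m-1}I_{\{x_m\}}(z_{m-1}) \\
&= I_{\{x_{n+1:m-1}\}}(z_{n+1:m-1})\,\overline{T}_{m-1}I_{\{x_m\}}(x_{m-1}),
\end{aligned} \tag{79}$$

where we have used the non-negative homogeneity (T3) of upper transition operators.
Therefore $\overline{T}_{m-1}I_{\{x_{n+1:m}\}} = I_{\{x_{n+1:m-1}\}}\,\overline{T}_{m-1}I_{\{x_m\}}(x_{m-1})$. Consequently, for any $z_{1:m-2} \in \mathcal{X}^{m-2}$:

$$\begin{aligned}
\overline{T}_{m-2}\overline{T}_{m-1}I_{\{x_{n+1:m}\}}(z_{1:m-2}) &= \overline{T}_{m-2}\big(\overline{T}_{m-1}I_{\{x_{n+1:m}\}}(z_{1:m-2},\cdot)\big)(z_{m-2}) \\
&= \overline{T}_{m-2}\big(I_{\{x_{n+1:m-2}\}}(z_{n+1:m-2})\,I_{\{x_{m-1}\}}\,\overline{T}_{m-1}I_{\{x_m\}}(x_{m-1})\big)(z_{m-2}) \\
&= I_{\{x_{n+1:m-2}\}}(z_{n+1:m-2})\,\overline{T}_{m-1}I_{\{x_m\}}(x_{m-1})\,\overline{T}_{m-2}I_{\{x_{m-1}\}}(z_{m-2}) \\
&= I_{\{x_{n+1:m-2}\}}(z_{n+1:m-2})\,\overline{T}_{m-1}I_{\{x_m\}}(x_{m-1})\,\overline{T}_{m-2}I_{\{x_{m-1}\}}(x_{m-2}),
\end{aligned} \tag{80}$$

again using (T3), and therefore

$$\overline{T}_{m-2}\overline{T}_{m-1}I_{\{x_{n+1:m}\}} = I_{\{x_{n+1:m-2}\}}\,\overline{T}_{m-1}I_{\{x_m\}}(x_{m-1})\,\overline{T}_{m-2}I_{\{x_{m-1}\}}(x_{m-2}). \tag{81}$$

Continuing in this fashion eventually leads to Eq. (24). □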
Proof of Proposition 4.3. Suppose $\mathcal{R}_\to \ne \emptyset$. Consider any maximal state $y$ [there always
is at least one, because $\mathcal{X}$ is finite] and any $x \in \mathcal{R}_\to$, then it is clear from the definition
of $\mathcal{R}_\to$ that $y \to x$. Since $y$ is maximal, it follows that also $x \to y$, and therefore $x \leftrightarrow y$.
We conclude that $\mathcal{R}_\to$ is included in all maximal communication classes. This means that
there is only one such maximal class, and $\mathcal{R}_\to$ is included in this top class. To show that
$\mathcal{R}_\to$ is equal to this top class, consider any maximal element $y$ and any $x \in \mathcal{R}_\to$. Then we
know that there is some $n \in \mathbb{N}$ such that for all $k \ge n$ and all $z \in \mathcal{X}$, $z \xrightarrow{k} x$. But we have
seen above that $x \leftrightarrow y$, so there is some $\ell \ge 0$ such that $x \xrightarrow{\ell} y$, and therefore $z \xrightarrow{k+\ell} y$ for all
$z \in \mathcal{X}$. This implies that $y \in \mathcal{R}_\to$, so $\mathcal{R}_\to$ is indeed the top class. We show that it is regular.
For each $x$ in $\mathcal{R}_\to$ there is an $n_x \in \mathbb{N}$ such that $y \xrightarrow{k} x$ for all $k \ge n_x$ and all $y \in \mathcal{R}_\to$. If we
define $n := \max\{n_x : x \in \mathcal{R}_\to\}$, then we see that $x \xrightarrow{k} y$ for all $k \ge n$ and all $x,y \in \mathcal{R}_\to$, so
$\mathcal{R}_\to$ is regular by Proposition 4.2 and therefore the chain is top class regular.

Conversely, assume that the chain is top class regular. Consider any state $x$ in the top class,
and any $y \in \mathcal{X}$. Then there is some $\ell_y \ge 0$ such that $y \xrightarrow{\ell_y} x$, and it follows from Proposi-
tion 4.2 that there is some $n \in \mathbb{N}$ such that $x \xrightarrow{k} x$ and therefore $y \xrightarrow{k+\ell_y} x$ for all $k \ge n$. So if
we let $m := n + \max\{\ell_y : y \in \mathcal{X}\}$, then we see that $y \xrightarrow{k} x$ for all $k \ge m$ and all $y \in \mathcal{X}$, and
therefore $x \in \mathcal{R}_\to$, whence $\mathcal{R}_\to \ne \emptyset$. □

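Proof of Proposition 4.4. Fix $x$, $y$ and $z$ in $\mathcal{X}$. Since $\overline{T}^m I_{\{y\}}(u) \ge 0$ for all $u \in \mathcal{X}$,
we have that

$$\overline{T}^m I_{\{y\}} = \sum_{u \in \mathcal{X}} \overline{T}^m I_{\{y\}}(u)\,I_{\{u\}} \ge \overline{T}^m I_{\{y\}}(z)\,I_{\{z\}}. \tag{82}$$

If we now apply the upper transition operator $\overline{T}$ $n$ times to both sides of this inequality, and
repeatedly invoke its monotonicity (T5) and non-negative homogeneity (T3), we find that
$\overline{T}^{n+m}I_{\{y\}} \ge \overline{T}^m I_{\{y\}}(z)\,\overline{T}^n I_{\{z\}}$ and hence indeed $\overline{T}^{n+m}I_{\{y\}}(x) \ge \overline{T}^n I_{\{z\}}(x)\,\overline{T}^m I_{\{y\}}(z)$. □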



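Proof of Proposition 4.5. Fix $x$ in $\mathcal{X}$. Boundedness (T1) and subadditivity (T2) guarantee
that $0 < 1 \le \overline{T}^n I_{\mathcal{X}}(x) \le \sum_{y \in \mathcal{X}} \overline{T}^n I_{\{y\}}(x)$. So there must be some $y \in \mathcal{X}$ for which
$\overline{T}^n I_{\{y\}}(x) > 0$. □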

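The following lemma provides a characterisation for top class regularity (under $\to$) that
is somewhat simpler than the one implicit in Proposition 4.3.

Lemma A.2. A stationary imprecise Markov chain is top class regular (under $\to$) if and
only if

$$\mathcal{R}_\to = \{x \in \mathcal{X} : (\exists n \in \mathbb{N})(\forall y \in \mathcal{X})\; y \xrightarrow{n} x\} \ne \emptyset. \tag{83}$$

Proof. Let $\mathcal{R}'_\to := \{x \in \mathcal{X} : (\exists n \in \mathbb{N})(\forall y \in \mathcal{X})\; y \xrightarrow{n} x\}$, then by Proposition 4.3 it suffices
to prove that $\mathcal{R}_\to = \mathcal{R}'_\to$. It is clear that $\mathcal{R}_\to \subseteq \mathcal{R}'_\to$, so we concentrate on the converse
inclusion. Consider any $x \in \mathcal{R}'_\to$ and $n \in \mathbb{N}$ such that $y \xrightarrow{n} x$ for all $y \in \mathcal{X}$. Then it suffices
to prove that also $y \xrightarrow{n+1} x$ for all $y \in \mathcal{X}$. Fix $y$, then there is some $z \in \mathcal{X}$ such that $\overline{P}_{yz} > 0$, by
Proposition 4.5. But since we know that for this $z$ also $\overline{P}^n_{zx} > 0$, we infer from Proposition 4.4
that indeed $\overline{P}^{n+1}_{yx} \ge \overline{P}_{yz}\overline{P}^n_{zx} > 0$. □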
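
The criterion in the lemma suggests a direct numerical check: a state $x$ qualifies as soon as some iterate of the upper transition operator applied to the indicator of $x$ is strictly positive in every state. The Python sketch below is an illustration under assumptions: the search is cut off at a user-chosen `max_n` (a hypothetical parameter, so a negative answer is only as good as the cut-off), and the example operator is the reliability chain of Section 6 with three states and the hypothetical interval [0.8, 0.9].

```python
def upper_T(h, r_lo, r_hi):
    """Upper transition operator of the reliability example on states 0..k."""
    k = len(h) - 1
    out = [r_lo * h[l] + (1 - r_hi) * h[l + 1]
           + (r_hi - r_lo) * max(h[l], h[l + 1]) for l in range(k)]
    return out + [h[k]]

def states_reached_from_everywhere(num_states, step, max_n):
    """States x such that T^n I_{x} > 0 everywhere for some n <= max_n."""
    found = []
    for x in range(num_states):
        g = [1.0 if y == x else 0.0 for y in range(num_states)]
        for _ in range(max_n):
            g = step(g)          # g is now T^n applied to the indicator of x
            if min(g) > 0:
                found.append(x)
                break
    return found

# Hypothetical usage: k = 2, reliability interval [0.8, 0.9].
step = lambda h: upper_T(h, 0.8, 0.9)
```

Here `states_reached_from_everywhere(3, step, 10)` singles out the absorbing state 2 only: states 0 and 1 can never be reached from state 2, so their indicators stay at zero there.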



Before we come to the upper expectation form of the Perron-Frobenius theorem (Theo-
rem 5.1), we first prove the following lemmas.

Lemma A.3. Let $\overline{T}$ be an upper transition operator associated with some stationary im-
precise Markov chain, meaning that it satisfies (T1)-(T3). Consider any $h \in \mathcal{L}(\mathcal{X})$. Then
the real sequence $\min\overline{T}^n h$, $n \in \mathbb{N}$ is non-decreasing and converges to some limit $l(h) \in \mathbb{R}$.
Similarly, the real sequence $\max\overline{T}^n h$, $n \in \mathbb{N}$ is non-increasing and converges to some limit
$L(h) \in \mathbb{R}$. Of course, $\min h \le l(h) \le L(h) \le \max h$.
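Proof. Fix $h$ in $\mathcal{L}(\mathcal{X})$ and consider any $n$ in $\mathbb{N}_0$. From $I_{\mathcal{X}}\min\overline{T}^n h \le \overline{T}^n h \le I_{\mathcal{X}}\max\overline{T}^n h$
we deduce using monotonicity (T5) that $\overline{T}(I_{\mathcal{X}}\min\overline{T}^n h) \le \overline{T}^{n+1}h \le \overline{T}(I_{\mathcal{X}}\max\overline{T}^n h)$, and there-
fore, since $\overline{T}$ leaves constant maps unchanged by (T1), that $I_{\mathcal{X}}\min\overline{T}^n h \le \overline{T}^{n+1}h \le I_{\mathcal{X}}\max\overline{T}^n h$.
Consequently,

$$\min h \le \min\overline{T}^n h \le \min\overline{T}^{n+1}h \le \max\overline{T}^{n+1}h \le \max\overline{T}^n h \le \max h. \tag{84}$$

This tells us that the real sequence $\max\overline{T}^n h$ is non-increasing and bounded below (by
$\min h$). It therefore converges to some real number $L(h)$. Similarly, the real sequence
$\min\overline{T}^n h$ is non-decreasing and bounded above (by $\max h$), and therefore converges to
some real number $l(h)$. That $\min h \le l(h) \le L(h) \le \max h$ follows from the inequalities
in Eq. (84) by taking the limit $n \to \infty$. □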

Lemma A.4. Let $\overline{T}$ be an upper transition operator associated with some stationary im-
precise Markov chain, meaning that it satisfies (T1)-(T3). Consider any $h \in \mathcal{L}(\mathcal{X})$. Then
there is some $x_o$ in $\mathcal{X}$ such that for all $n \in \mathbb{N}$ there is some $k_n \ge n$ for which $L(h) \le \overline{T}^{k_n}h(x_o)$.
Moreover, $\lim_{n\to\infty}\overline{T}^{k_n}h(x_o) = \limsup_{n\to\infty}\overline{T}^n h(x_o) = L(h)$.

Proof. Suppose, ex absurdo, that for any $x \in \mathcal{X}$ there is some $n_x \in \mathbb{N}$ such that for all $k \ge
n_x$, $\overline{T}^k h(x) < L(h)$. Since $\mathcal{X}$ is finite, this implies that there is some $n := \max\{n_x : x \in \mathcal{X}\}$
such that for all $k \ge n$, $\max\overline{T}^k h < L(h)$. This contradicts the conclusion $\max\overline{T}^n h \downarrow L(h)$
obtained in Lemma A.3.

Next, we show that $\lim_{n\to\infty}\overline{T}^{k_n}h(x_o) = L(h)$. For all $n \in \mathbb{N}$, $L(h) \le \overline{T}^{k_n}h(x_o) \le \max\overline{T}^{k_n}h$,
and since the subsequence $\max\overline{T}^{k_n}h$ converges to the same limit $L(h)$ as the convergent
sequence $\max\overline{T}^n h$, we see that the sequence $\overline{T}^{k_n}h(x_o)$ converges to $L(h)$ as well.

To conclude, we show that $\limsup_{n\to\infty}\overline{T}^n h(x_o) = L(h)$. Since the limit superior of a
sequence is the supremum of the limits of all its convergent subsequences, and since more-
over we have just proved that $\lim_{n\to\infty}\overline{T}^{k_n}h(x_o) = L(h)$, we infer that $\limsup_{n\to\infty}\overline{T}^n h(x_o) \ge
L(h)$. For the converse inequality: starting from $\overline{T}^n h(x_o) \le \max\overline{T}^n h$ and taking the limit su-
perior on both sides of the inequality yields $\limsup_{n\to\infty}\overline{T}^n h(x_o) \le \limsup_{n\to\infty}\max\overline{T}^n h =
L(h)$, where the equality follows from Lemma A.3. □

Lemma A.5. Let $\overline{T}$ be an upper transition operator associated with some stationary im-
precise Markov chain, meaning that it satisfies (T1)-(T3). Consider any $h \in \mathcal{L}(\mathcal{X})$. If the
imprecise Markov chain is regularly absorbing, then $l(h) = L(h)$.

Proof. Since the imprecise Markov chain is in particular top class regular (under $\to$), we
have by Proposition 4.3 that $\mathcal{R}_\to \ne \emptyset$. Consider any $x \in \mathcal{R}_\to$, then we first prove that
$\lim_{n\to\infty}\overline{T}^n h(x) = l(h)$. We know from the definition of $\mathcal{R}_\to$ that there is some $n_x \in \mathbb{N}$ such
that $\min\overline{T}^{n_x}I_{\{x\}} > 0$. Also, for any $n \ge 0$,

$$0 \le [\overline{T}^n h(x) - \min\overline{T}^n h]\,I_{\{x\}} \le \overline{T}^n h - \min\overline{T}^n h, \tag{85}$$

and if we apply $\overline{T}$ $n_x$ times to all sides of these inequalities, we get

$$0 \le [\overline{T}^n h(x) - \min\overline{T}^n h]\,\overline{T}^{n_x}I_{\{x\}} \le \overline{T}^{n+n_x}h - \min\overline{T}^n h, \tag{86}$$

after repeated use of (T5), (T4) and (T3). Taking the minimum of all sides of these inequal-
ities leads to

$$0 \le [\overline{T}^n h(x) - \min\overline{T}^n h]\,\min\overline{T}^{n_x}I_{\{x\}} \le \min\overline{T}^{n+n_x}h - \min\overline{T}^n h. \tag{87}$$

If we now let $n \to \infty$, we see that since the term on the right converges to zero [see
Lemma A.3], so must the middle term. Since $\min\overline{T}^{n_x}I_{\{x\}} > 0$, this implies that $\overline{T}^n h(x) -
\min\overline{T}^n h$ converges to zero, whence indeed $\lim_{n\to\infty}\overline{T}^n h(x) = \lim_{n\to\infty}\min\overline{T}^n h = l(h)$.

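As a next step, we infer from Lemma A.4 that there is some $x_o$ in $\mathcal{X}$ and some strictly
increasing sequence $k_n$ of natural numbers, such that $L(h) \le \overline{T}^{k_n}h(x_o)$ for all $n \in \mathbb{N}$, and
moreover $\limsup_{n\to\infty}\overline{T}^n h(x_o) = L(h)$.

There are now two possibilities. The first is that $x_o \in \mathcal{R}_\to$. Then it follows from the dis-
cussion above that $\lim_{n\to\infty}\overline{T}^n h(x_o) = l(h)$. But since we also have that $\lim_{n\to\infty}\overline{T}^n h(x_o) =
\lim_{n\to\infty}\overline{T}^{k_n}h(x_o) = L(h)$, where the last equality follows from Lemma A.4, we infer that in
this case indeed $l(h) = L(h)$.

The second possibility is that $x_o \notin \mathcal{R}_\to$, but then it follows from the assumption that
there is some $n_o \in \mathbb{N}$ such that $\underline{T}^{n_o}I_{\mathcal{R}_\to}(x_o) > 0$. We have for all $n \in \mathbb{N}$ that

$$0 \le \Big[\max\overline{T}^n h - \max_{y \in \mathcal{R}_\to}\overline{T}^n h(y)\Big]\,I_{\mathcal{R}_\to} \le \max\overline{T}^n h - \overline{T}^n h, \tag{88}$$

and if we apply $\underline{T}$ $n_o$ times to all sides of these inequalities, we get

$$0 \le \Big[\max\overline{T}^n h - \max_{y \in \mathcal{R}_\to}\overline{T}^n h(y)\Big]\,\underline{T}^{n_o}I_{\mathcal{R}_\to}(x_o) \le \max\overline{T}^n h - \overline{T}^{n_o+n}h(x_o), \tag{89}$$

after repeated use of (T5), (T4), (T3) and (T7), some rearranging, and evaluating in $x_o$. If
we now take the limit inferior for $n \to \infty$ of all sides in these inequalities, we find:

$$0 \le \underline{T}^{n_o}I_{\mathcal{R}_\to}(x_o)\,\liminf_{n\to\infty}\Big[\max\overline{T}^n h - \max_{y \in \mathcal{R}_\to}\overline{T}^n h(y)\Big] \le \liminf_{n\to\infty}\big[\max\overline{T}^n h - \overline{T}^{n_o+n}h(x_o)\big]. \tag{90}$$

Since $\max\overline{T}^n h \to L(h)$ and $\max_{y \in \mathcal{R}_\to}\overline{T}^n h(y) \to l(h)$ [by the reasoning above, $\overline{T}^n h(y) \to
l(h)$ for all $y \in \mathcal{R}_\to$], we infer that $\liminf_{n\to\infty}[\max\overline{T}^n h - \max_{y \in \mathcal{R}_\to}\overline{T}^n h(y)] = L(h) - l(h)$
from the properties of the liminf operator. It also follows for similar reasons that

$$\liminf_{n\to\infty}\big[\max\overline{T}^n h - \overline{T}^{n_o+n}h(x_o)\big] = \lim_{n\to\infty}\max\overline{T}^n h - \limsup_{n\to\infty}\overline{T}^{n_o+n}h(x_o) = L(h) - L(h). \tag{91}$$

So we infer from Eq. (90) that $\underline{T}^{n_o}I_{\mathcal{R}_\to}(x_o)\,[L(h) - l(h)] = 0$, and therefore that also in this
case $l(h) = L(h)$, since by assumption $\underline{T}^{n_o}I_{\mathcal{R}_\to}(x_o) > 0$. □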



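Proof of Theorem 5.1. Since $I_{\mathcal{X}}\min\overline{T}^n h \le \overline{T}^n h \le I_{\mathcal{X}}\max\overline{T}^n h$, and by Lemma A.5 both
sequences $\min\overline{T}^n h$ and $\max\overline{T}^n h$ converge to the same real limit, which we denote by $\mu_h$,
it follows that $\overline{T}^n h$ converges (point-wise) to $I_{\mathcal{X}}\mu_h$: $\lim_{n\to\infty}\overline{T}^n h = I_{\mathcal{X}}\mu_h$. If we use the
continuity of the upper expectation operator $\overline{E}_1$, as well as (T4) and (T3), we get

$$\lim_{n\to\infty}\overline{E}_1(\overline{T}^{n-1}h) = \overline{E}_1\big(\lim_{n\to\infty}\overline{T}^{n-1}h\big) = \overline{E}_1(I_{\mathcal{X}}\mu_h) = \mu_h, \tag{92}$$

and this limit is indeed independent of the choice of $\overline{E}_1$. Hence we find for the limit that
$\overline{E}_\infty(h) = \mu_h$.

To complete the proof, consider any upper expectation $\overline{E}_1$ on $\mathcal{L}(\mathcal{X})$ and any $h$ in
$\mathcal{L}(\mathcal{X})$, then for all $n \in \mathbb{N}$, $\overline{E}_1(\overline{T}^n h) = \overline{E}_1(\overline{T}^{n-1}\overline{T}h)$. If we let $n \to \infty$ on both sides of this
equality, we find that $\overline{E}_\infty(h) = \overline{E}_\infty(\overline{T}h)$, showing that $\overline{E}_\infty$ is indeed $\overline{T}$-invariant. Now let $\overline{E}_1$
be any $\overline{T}$-invariant upper expectation on $\mathcal{L}(\mathcal{X})$. Then we find for any $h$ in $\mathcal{L}(\mathcal{X})$, and
for all $n \in \mathbb{N}$, that $\overline{E}_1(\overline{T}^{n-1}h) = \overline{E}_1(h)$, and if we let $n \to \infty$ on both sides of this equality,
we find that $\overline{E}_\infty(h) = \overline{E}_1(h)$. □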
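
The convergence argument can be watched numerically. The Python sketch below is an illustration only: it uses a hypothetical two-state chain whose local credal sets are given by extreme points with strictly positive entries, and all names are ours. It records $\min\overline{T}^n h$ and $\max\overline{T}^n h$ along the iteration and then evaluates two different initial models on the (numerically) constant limit.

```python
def upper_expectation(h, extreme_points):
    """Maximise the expectation of h over a finite list of mass functions."""
    return max(sum(p[x] * h[x] for x in range(len(h))) for p in extreme_points)

def upper_T(h, local_models):
    return [upper_expectation(h, pts) for pts in local_models]

def iterate_with_bounds(h, steps, local_models):
    """Return T^steps h together with the min and max of every earlier iterate."""
    mins, maxs = [], []
    g = list(h)
    for _ in range(steps):
        mins.append(min(g))
        maxs.append(max(g))
        g = upper_T(g, local_models)
    return g, mins, maxs

# Hypothetical local models (all entries strictly positive).
LOCAL = [[(0.9, 0.1), (0.7, 0.3)], [(0.4, 0.6), (0.2, 0.8)]]
limit, mins, maxs = iterate_with_bounds([0.0, 1.0], 200, LOCAL)
```

In this example `mins` increases, `maxs` decreases, and the two meet, so that any initial model, precise or imprecise, yields the same limit upper expectation.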



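Proof of Proposition 5.2. We begin with the first statement. It clearly suffices to prove that
for any $k \in \mathbb{N}$, with obvious notations, $\mathcal{T}\cdot\mathcal{T}_k \subseteq \mathcal{T}_{k+1}$. In other words, consider any
$R \in \mathcal{T}$ and any $S \in \mathcal{T}_k$, then we have to show that $T := RS \in \mathcal{T}_{k+1}$. By Eq. (42), $R \in \mathcal{T}$
means that for all $x \in \mathcal{X}$ there is some $r(\cdot|x) \in \mathcal{M}(\cdot|x)$ such that $R_{xy} = r(y|x)$ for all $y \in \mathcal{X}$.
Similarly, $S \in \mathcal{T}_k$ means that for all $y \in \mathcal{X}$ there is some $s(\cdot|y) \in \mathcal{M}_k(\cdot|y)$
such that $S_{yz} = s(z|y)$ for all $z \in \mathcal{X}$. Now for all $x \in \mathcal{X}$ and all $h \in \mathcal{L}(\mathcal{X})$,

$$\begin{aligned}
\overline{T}^{k+1}h(x) &= \overline{T}(\overline{T}^k h)(x) \ge E_{r(\cdot|x)}(\overline{T}^k h) = \sum_{y \in \mathcal{X}} r(y|x)\,\overline{T}^k h(y) \\
&\ge \sum_{y \in \mathcal{X}} r(y|x)\,E_{s(\cdot|y)}(h) = \sum_{y \in \mathcal{X}} r(y|x) \sum_{z \in \mathcal{X}} s(z|y)\,h(z) = \sum_{y,z \in \mathcal{X}} R_{xy}S_{yz}\,h(z) = \sum_{z \in \mathcal{X}} T_{xz}\,h(z),
\end{aligned}$$

where both inequalities follow from Eq. (40). If we now consider, for each $x \in \mathcal{X}$, the
mass function $q(\cdot|x)$ given by $q(z|x) := T_{xz} = \sum_{y \in \mathcal{X}} s(z|y)\,r(y|x)$ for all $z \in \mathcal{X}$, then this
means that $\overline{T}^{k+1}h(x) \ge E_{q(\cdot|x)}(h)$ for all $h \in \mathcal{L}(\mathcal{X})$, and therefore $q(\cdot|x) \in \mathcal{M}_{k+1}(\cdot|x)$, for
all $x \in \mathcal{X}$, by Eq. (40). Hence indeed $T \in \mathcal{T}_{k+1}$, by Eq. (42).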

On to the second statement. We give a proof by induction. We first show that the state-
ment holds for $n = 1$. We know from the definition (40) of $\mathcal{M}(\cdot|x)$ and Eq. (41) that for
each $x$ in $\mathcal{X}$ there is some $q(\cdot|x) \in \mathcal{M}(\cdot|x)$ such that $\overline{T}h(x) = \sum_{y \in \mathcal{X}} q(y|x)h(y)$. There-
fore the transition matrix $T$, defined by $T_{xy} := q(y|x)$ for all $x,y \in \mathcal{X}$, belongs to $\mathcal{T}$ [see
Eq. (42)] and satisfies $\overline{T}h(x) = \sum_{y \in \mathcal{X}} T_{xy}h(y) = (Th)_x$.

Next, we show that if the statement holds for $n = m$ [the induction hypothesis], it also
holds for $n = m+1$, where $m \in \mathbb{N}$. Consider the real-valued map $g := \overline{T}^m h$, then $\overline{T}^{m+1}h =
\overline{T}g$. We know from the reasoning above that there is some $T_1 \in \mathcal{T}$ such that $\overline{T}g(x) = (T_1 g)_x$
for all $x \in \mathcal{X}$. And the induction hypothesis tells us that there is some $T_2 \in \mathcal{T}^m$ such that
$g(y) = \overline{T}^m h(y) = (T_2 h)_y$ for all $y \in \mathcal{X}$. Hence we find that for all $x \in \mathcal{X}$:

$$\overline{T}^{m+1}h(x) = \overline{T}g(x) = \sum_{y \in \mathcal{X}} (T_1)_{xy}\,g(y) = \sum_{y \in \mathcal{X}} (T_1)_{xy} \sum_{z \in \mathcal{X}} (T_2)_{yz}\,h(z) = \sum_{z \in \mathcal{X}} (T_1T_2)_{xz}\,h(z) = (T_1T_2h)_x, \tag{93}$$

and clearly $T_1T_2 \in \mathcal{T}^{m+1}$. This concludes the proof of the second statement.

The third statement is an immediate consequence of the first and second statements. □
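
The agreement between the two approaches, and the difference in cost, can be illustrated on a small example. The Python sketch below assumes local credal sets given by finitely many extreme points, so that it suffices to consider transition matrices whose rows are extreme points; the example models and all names are hypothetical. It compares $n$ applications of the upper transition operator with a brute-force maximum over all products of $n$ such matrices.

```python
from itertools import product

# Hypothetical two-state local models (extreme points of the credal sets).
LOCAL = [[(0.9, 0.1), (0.7, 0.3)], [(0.4, 0.6), (0.2, 0.8)]]

def apply_matrix(matrix, h):
    """Matrix-vector product (T h)_x = sum_y T_xy h(y)."""
    return [sum(row[y] * h[y] for y in range(len(h))) for row in matrix]

def upper_by_operator(h, n, local_models=LOCAL):
    """T^n h by n applications of the upper transition operator."""
    g = list(h)
    for _ in range(n):
        g = [max(sum(p[y] * g[y] for y in range(len(g))) for p in pts)
             for pts in local_models]
    return g

def upper_by_products(h, n, local_models=LOCAL):
    """Point-wise maximum of (T_1 ... T_n h) over products of extreme matrices."""
    matrices = [list(rows) for rows in product(*local_models)]
    best = [float("-inf")] * len(h)
    for seq in product(matrices, repeat=n):
        g = list(h)
        for m in reversed(seq):      # T_n acts on h first, T_1 last
            g = apply_matrix(m, g)
        best = [max(b, v) for b, v in zip(best, g)]
    return best
```

Both functions return the same values, but in this example the second one visits $4^n$ matrix products, whereas the first needs only $n$ passes over the states.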

Finally, we turn to the proof of Proposition 5.3. We first prove an alternative characteri-
sation of the product scrambling property.

Lemma A.6. A set $\mathcal{T}$ of transition matrices is product scrambling if and only if

$$(\exists n \in \mathbb{N})(\forall k \ge n)(\forall T \in \mathcal{T}^k)(\forall x,y \in \mathcal{X})(\exists z \in \mathcal{X})\; T_{xz} > 0 \wedge T_{yz} > 0. \tag{94}$$

Proof. Recall that $\mathcal{T}$ is called product scrambling if

$$(\exists n \in \mathbb{N})(\forall T \in \mathcal{T}^n)\; \tau(T) < 1. \tag{95}$$

Since the coefficient of ergodicity satisfies the submultiplicative property [15, Section 1.2]:

$$\tau(T_1T_2) \le \tau(T_1)\,\tau(T_2) \quad \text{for all transition matrices } T_1 \text{ and } T_2, \tag{96}$$

we see that the product scrambling condition is equivalent to [see also [15, Lemma 3.2] for
a related result]:

$$(\exists n \in \mathbb{N})(\forall k \ge n)(\forall T \in \mathcal{T}^k)\; \tau(T) < 1. \tag{97}$$

Now use Eq. (46). □
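
For readers who want to experiment with the scrambling condition, the Python sketch below computes a coefficient of ergodicity for a single transition matrix. It assumes the common definition $\tau(T) = 1 - \min_{x,y}\sum_z \min\{T_{xz},T_{yz}\}$, which we have not checked against the exact form used in Eq. (46); the matrices and function names are hypothetical.

```python
def tau(T):
    """Coefficient of ergodicity: 1 minus the smallest overlap between two rows."""
    n = len(T)
    overlap = min(sum(min(T[x][z], T[y][z]) for z in range(n))
                  for x in range(n) for y in range(n))
    return 1.0 - overlap

def is_scrambling(T):
    """Every pair of rows shares a column where both entries are positive."""
    n = len(T)
    return all(any(T[x][z] > 0 and T[y][z] > 0 for z in range(n))
               for x in range(n) for y in range(n))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Hypothetical example matrices.
T1 = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.5, 0.0, 0.5]]
T2 = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

Under this definition, `tau(T) < 1` holds exactly when `is_scrambling(T)` is true, and the product `matmul(T1, T2)` satisfies the submultiplicative bound even though `T2` on its own is not scrambling.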

Proof of Proposition 5.3. Assume that $\mathcal{T}$ is product scrambling. We prove that this im-
plies that the corresponding stationary imprecise Markov chain with upper transition oper-
ator $\overline{T}$ is regularly absorbing: (a) it is top class regular and (b) for every $y$ not in the top
class there is some $n \in \mathbb{N}$ such that $\underline{T}^n I_{\mathcal{R}_\to}(y) > 0$.

We first prove that the Markov chain has a top class under $\to$. It follows from the
characterisation (94) of the product scrambling condition in Lemma A.6 that

$$(\forall x,y \in \mathcal{X})(\exists z \in \mathcal{X})\; x \to z \wedge y \to z, \tag{98}$$

if we also take into account Proposition 5.2. For any $x,y \in C$, where $C \subseteq \mathcal{X}$ is the [always
non-empty] set of all maximal states, we know that $x \to z \Rightarrow z \to x$ and $y \to z \Rightarrow z \to y$ for
all $z \in \mathcal{X}$, so we infer from Eq. (98) that both $x \to y$ and $y \to x$, so $x$ and $y$ communicate.
This means that the whole of $C$ forms one single communication class: $C$ is the top class.

We now show that this top class $C$ is regular, i.e., consists of a single cyclic subclass,
if we recall our discussion of periodicity in Section 4.1. Let $d_C$ be the period of the top
class $C$, and consider any $x$ and $y$ in $C$. Using the same reasoning as above, we infer from
Eq. (94) and Proposition 5.2 that for large enough $k$:

$$(\exists z_k \in C)\; x \xrightarrow{k} z_k \wedge y \xrightarrow{k} z_k \tag{99}$$

[that $z_k \in C$ follows from the fact that $x$ and $y$ are maximal]. Moreover, Proposition 4.1
tells us that for large enough $\ell$ and $\ell'$, $t_{z_kx} + \ell d_C \in N_{z_kx}$ and $t_{z_ky} + \ell' d_C \in N_{z_ky}$, and therefore
also $k + t_{z_kx} + \ell d_C \in N_{xx}$ and $k + t_{z_ky} + \ell' d_C \in N_{yy}$. This implies that $t_{z_kx} = t_{z_ky}$, and therefore
$t_{xy} = 0$: $x$ and $y$ belong to the same cyclic class. This holds for all $x,y \in C$, so $C$ consists of
only one cyclic class (under $\to$). The top class $C$ is in other words aperiodic and therefore
regular. This proves (a).
To prove (b), assume the stationary imprecise Markov chain is top class regular but
not regularly absorbing. We show that the set of transition matrices $\mathcal{T}$ cannot be prod-
uct scrambling. By Definition 4.1, we know that there is some $y_o \in \mathcal{X} \setminus \mathcal{R}_\to$ such that
$\underline{T}^n I_{\mathcal{R}_\to}(y_o) = 0$ for all $n \in \mathbb{N}$. If we now also invoke Eq. (43) in Proposition 5.2, we see that
for all $n \in \mathbb{N}$, there is some $T^*_n \in \mathcal{T}^n$ such that:

$$(\forall u \in \mathcal{R}_\to)\; (T^*_n)_{y_ou} = 0. \tag{100}$$

Now consider any $x_o$ in the top class $\mathcal{R}_\to$ [this is possible since by assumption $\mathcal{R}_\to \ne \emptyset$].
Since $x_o$ cannot communicate with any element outside $\mathcal{R}_\to$, we infer in particular from
Eq. (43) in Proposition 5.2 that for all $n \in \mathbb{N}$:

$$(\forall v \in \mathcal{X} \setminus \mathcal{R}_\to)\; (T^*_n)_{x_ov} = 0. \tag{101}$$

But Eqs. (100) and (101) taken together imply [see Eq. (46)] that $\tau(T^*_n) = 1$ for all $n \in \mathbb{N}$,
so the set $\mathcal{T}$ is not product scrambling. □